These web pages assume that you are using Stata Version 9 for Windows on a PC.
If you are NOT using a PC at the Carolina Population Center, you need to copy the example Stata data files to your local PC.
Click here for instructions.
This tutorial is function-oriented, focusing on data-management
tasks. It works up from basic tasks, such as how to drop variables, to
the tasks needed for complex file organization, such as how to reshape
and merge data files.
There is also a section on Analyzing Data from Sample Surveys.
It explains which sampling weight command to use and whether to use svy or robust cluster to adjust for survey design effects.
See Stata Windows environment below for an orientation to the Windows interface.
It also gives you sources of help beyond this tutorial.
Other resources available to help you learn Stata include the
UCLA Stat Computing Portal and Stata Corporation's
Resources for learning Stata.
SAS Users: the SAS User's Guide to Stata may help you make the transition from SAS to Stata.
See also CPC's highlights of Stata's "What's New?" for the changes in the latest version.
- input: putting data into Stata
- generate: creating a new variable
- list: listing the contents of memory
- save: saving memory in a permanent Stata-format file
- log: capturing the results of Stata commands for printing
- Stata's default actions
- how data are stored in RAM
- clear: clearing Stata's memory
- set memory: allowing enough space for the data
- use: copying the file into memory
- save, replace: saving changes
- describe: names of variables
- summarize: the mean, min, and max of variables
- codebook: more univariate statistics
- tabulate: frequencies and cross-tabulations
- data types and data storage
- if: do command for a subset of observations
- sort: order observations by the values of a variable
- by: do command for groups of observations (requires sort)
- in: do command for a range of observations
- relational, logical, and arithmetic operators
- missing values
- replace: change the values of a variable
- recode: change the values of a variable
- rename: change a variable name
- label: labelling variables, values, and data files
- drop: drop one or more variables
- drop if: drop observations conditional on one or more variables
- edit: editing the data file directly
- do: storing and executing commands in do-files
- #delimit: writing long commands in do-files
- /* */: documenting your do-files
- finding and fixing outliers
- [_n]: finding duplicate ids
- egen: add summary statistics to each observation
- collapse: create file of summary statistics by groups
Combining data files
- reshape long: change variables to observations
- reshape wide: change observations to variables
- histogram with normal curve fitted to it
- graph box plot displayed for two groups
- scatter plot
- twoway scatter plot with regression line
- other resources for learning graphics in Stata 8
Analyzing Data from Sample Surveys
(For a complete discussion of this topic, see
Guidelines for Analyzing Add Health Data by Kim Chantala.)
Miscellaneous Tips and Tricks
Authors: Phil Bardsley and Dan Blanchette
Questions or comments? If you are affiliated with the Carolina Population Center, send them to
Phil Bardsley.