Changing data values
Replacing values, recoding variables, editing data, labelling data.
If you haven't read about relational, logical, and arithmetic operators on the previous page, Groups and Subsets of Data, read the brief summary of them before continuing.
We'll continue using the 1999 Tanzania Facility Survey data.
    clear
    use "q:\utilities\statatut\exampfac.dta"

    /* as a precaution, make a copy of the variable you want to recode */
    gen factype2 = factype

    /* change 8 values of factype2 into 4 */
    replace factype2 = 1 if factype2 >= 2 & factype2 <= 4
    replace factype2 = 2 if factype2 == 5
    replace factype2 = 3 if factype2 == 6
    replace factype2 = 4 if factype2 == 7 | factype2 == 9

    /* the recode command does this more efficiently */
    /* recode factype2 2/4=1 5=2 6=3 7 9=4 */

    /* change a variable's name */
    rename factype2 type

    /* give the new variable some labels */
    label variable type "Recoded facility type"
    label define fac 1 "Hospital" 2 "HealthCenter" 3 "Dispensary" 4 "Other"
    label values type fac

    /* the data file already has a label, but this is how it was done: */
    label data "1999 Tanzania Facility Survey"

    /* delete the original factype variable */
    drop factype

    /* delete observations with missing facility type */
    drop if missing(type)

    browse
    edit
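The commented-out recode line above can also do the copying and labelling in one step. The following is a minimal sketch, not the tutorial's own method: it assumes the same data file, and type_alt and fac_alt are hypothetical names chosen here so they do not clash with type and fac.

```stata
* Sketch only. Reload the original file; the clear option discards the changes made above.
use "q:\utilities\statatut\exampfac.dta", clear

* generate() leaves factype untouched and writes the recoded copy to type_alt (hypothetical name).
* label() names the value label built from the quoted text in each rule (fac_alt is hypothetical).
recode factype (2/4 = 1 "Hospital") (5 = 2 "HealthCenter") ///
               (6 = 3 "Dispensary") (7 9 = 4 "Other"),      ///
               generate(type_alt) label(fac_alt)

label variable type_alt "Recoded facility type"

* Cross-tabulate old against new codes, including missing values, to check the mapping.
tabulate factype type_alt, missing
```

Because the original variable is never overwritten, the cross-tabulation gives a direct check that each old code landed in the intended new category.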
1. What would happen if the replace commands were run in the reverse order?

    replace factype2 = 4 if factype2 == 7 | factype2 == 9
    replace factype2 = 3 if factype2 == 6
    replace factype2 = 2 if factype2 == 5
    replace factype2 = 1 if factype2 >= 2 & factype2 <= 4
2. What happens to missing values of factype2? Answer.
3. How many variable names can you change with one rename command? Answer.
4. The label values command assigns a set of labels to the values of one variable. If I have several variables needing the same value labels, can I define one label and assign it to all the variables? Answer.
5. How can I change existing variable and value labels? Answer.
6. What if I want to drop most of my variables? That's a lot of typing! Answer.
7. What is the difference between browse and edit? Answer.
1. All observations with a nonmissing facility type would end up with a value of 1 for factype2. Remember that Stata executes each command for every observation before it moves on to the next command, so values changed by an earlier replace are picked up again by the later ones. The recode command makes all these changes in a single command, so the order of the changes doesn't matter.
2. We didn't change the missing values in either the replace or recode examples, so they remain missing.
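Answers 1 and 2 can be checked on a small made-up dataset. This is only a sketch: the seven values typed in below are invented for illustration and are not the survey data.

```stata
* Made-up values for illustration; not the survey data.
clear
input factype2
1
2
5
6
7
9
.
end

* Reverse order: each replace runs over every observation before the next one starts.
replace factype2 = 4 if factype2 == 7 | factype2 == 9
replace factype2 = 3 if factype2 == 6
replace factype2 = 2 if factype2 == 5
replace factype2 = 1 if factype2 >= 2 & factype2 <= 4

* Every nonmissing value has collapsed to 1; the missing value is untouched.
tabulate factype2, missing
assert factype2 == 1 if !missing(factype2)
count if missing(factype2)
```

The missing value survives because Stata treats missing as larger than any number, so the condition factype2 <= 4 is false for it.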
4. Yes. Define the label once, then assign it to each variable:

    label def yesno 1 "Yes" 2 "No"
    label val pill yesno
    label val inject yesno
    /* etc. */
    label val natural yesno
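When many variables share one label, a loop saves retyping the label values line. The sketch below uses made-up data; pill, inject, and natural are the variable names from the example above, and their values here are invented.

```stata
* Made-up data for illustration only.
clear
set obs 4
generate pill    = 1 + mod(_n, 2)
generate inject  = 1 + mod(_n, 2)
generate natural = 1 + mod(_n, 2)

label define yesno 1 "Yes" 2 "No"

* Attach the same value label to each variable in turn.
foreach v of varlist pill inject natural {
    label values `v' yesno
}

* describe shows yesno in the value label column for all three variables.
describe pill inject natural
```

Recent versions of Stata also appear to accept a list of variables in a single label values command; check help label for the version you are running.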
5. To change a variable label, run the label variable command again with the new text:

    label var factype "Type of facility"
To change a value label, you have two choices:
- drop the existing label and recreate it with the changes, or
- create a new label and reassign the variable to the new label.
The describe command shows you which variables have value labels attached to them and the names of the labels. Using that command, we see that the variable urbrur has the label urb. Here's how to drop urb and recreate it with new values:
    label drop urb
    label def urb 1 "RURAL" 2 "URBAN" 3 "MIXED"
    label val urbrur urb
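The second choice, creating a new label and reassigning the variable, might look like the sketch below. The data are made up, the starting definition of urb is invented for the example, and urb2 is a hypothetical name for the new label.

```stata
* Made-up data; the initial contents of urb are an assumption for this example.
clear
input urbrur
1
2
3
end
label define urb 1 "Rural" 2 "Urban"
label values urbrur urb

* Define a new label (urb2 is a hypothetical name) and point the variable at it.
label define urb2 1 "RURAL" 2 "URBAN" 3 "MIXED"
label values urbrur urb2

* Check the result: urbrur should now list urb2 as its value label.
label list urb2
describe urbrur
```

The old label urb stays defined in the dataset until it is dropped. The label define command also has add and modify options that change an existing label in place, which may be simpler when only one or two values need new text.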
7. Both commands display the data in spreadsheet format. The edit command allows you to make changes directly in the data, just as you would in a spreadsheet. The browse command allows you to view the data but not to make changes.
Note: the edit command does make entries in the Stata log, but they are not a very useful record of the changes you've made. The best record of data editing identifies each changed observation by a case identifier, but edit refers to observations by _n, which depends on the current sort order of the data in memory. Because edit leaves you with no dependable record of what you've done, we recommend that you never use it.
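A correction written as a replace command keyed on a case identifier gives the kind of record described above. This is a minimal sketch on made-up data; facid is a hypothetical identifier and is not confirmed to be a variable in the survey file.

```stata
* Made-up data; facid is a hypothetical case identifier.
clear
input facid type
101 1
102 3
103 2
end

* Correct one record by identifier, so the fix does not depend on sort order.
replace type = 4 if facid == 102

* Confirm that the intended record, and only that record, changed.
list facid type
assert type == 4 if facid == 102
```

Kept in a do-file, a line like this can be rerun on a fresh copy of the data and still reach the same observation, which a change made by row position cannot guarantee.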