Changing data values

 

Replacing values, recoding variables, editing data, labelling data.

If you haven't read about relational, logical, and arithmetic operators on the previous page, Groups and Subsets of Data, review the brief summary there before continuing.

We'll continue using the 1999 Tanzania Facility Survey data.


       clear
       use "q:\utilities\statatut\exampfac.dta"

    /* as a precaution, make a copy of the variable you want to recode */

       gen factype2=factype

    /* change 8 values of factype2 into 4 */

       replace factype2= 1 if factype2 >= 2 & factype2 <= 4
       replace factype2= 2 if factype2 == 5
       replace factype2= 3 if factype2 == 6
       replace factype2= 4 if factype2 == 7 | factype2 == 9 

    /* the recode command does this in one step; use it instead of the replace commands above, not after them */

    /* recode factype2 (2/4=1) (5=2) (6=3) (7 9=4) */

    /* change a variable's name */

       rename factype2 type

    /* give the new variable some labels */

       label variable type "Recoded facility type"
    /* type "help label" for more on the label variable command */

       label define fac 1 "Hospital" 2 "HealthCenter" 3 "Dispensary" 4 "Other"
       label values type fac
    /* type "help label" for more on the label values command */

    /* the data file already has a label, but this is how it was done: */

       label data "1999 Tanzania Facility Survey"

    /* delete the original factype variable */

       drop factype

    /* delete observations with missing facility type */

       drop if missing(type)
    /* view the data in a spreadsheet window: browse is read-only, edit allows changes */

       browse
       edit
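
The copy, recode, and label steps above can also be collapsed into one recode call. This is a minimal sketch rather than the method used in this lesson: it assumes a freshly loaded data file in which factype still exists and neither type nor the value label fac has been defined yet. The generate() option leaves factype untouched, and label() names the value label built from the text in each rule.

       recode factype (1/4=1 "Hospital") (5=2 "HealthCenter") (6=3 "Dispensary") (7 9=4 "Other"), generate(type) label(fac)
       label variable type "Recoded facility type"

    /* cross-tabulate old against new to check the recode before dropping factype */

       tabulate factype type, missing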

 

Questions:

1. The replace command changes specific values in an existing variable. What would happen if you reversed the order of these replace commands:

       replace factype2= 4 if factype2 == 7 | factype2 == 9 
       replace factype2= 3 if factype2 == 6
       replace factype2= 2 if factype2 == 5
       replace factype2= 1 if factype2 >= 2 & factype2 <= 4



2. What happens to missing values of factype2?


3. How many variable names can you change with one rename command?


4. The label values command assigns a set of labels to the values of one variable. If I have several variables needing the same value labels, can I define one label and assign it to all the variables?


5. How can I change existing variable and value labels?


6. What if I want to drop most of my variables? That's a lot of typing!


7. What is the difference between browse and edit?

 

 


Answers:

 

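1. Every observation with a nonmissing factype2 would end up with a value of 1. Remember that each Stata command is executed for every observation before the next command is executed, so by the time the last replace runs, the earlier commands have already moved the original values 5, 6, 7, and 9 into the range 2 to 4. The recode command makes all these changes in a single command, so the order of the changes doesn't matter.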
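
To see the effect for yourself, here is a small self-contained sketch. The variables x and y and the seven values typed in are made up for illustration and are not part of the survey file.

       clear
       input x
       1
       2
       5
       6
       7
       9
       .
       end

    /* copy x, then apply the replace commands in reversed order */

       gen y = x
       replace y = 4 if y == 7 | y == 9
       replace y = 3 if y == 6
       replace y = 2 if y == 5
       replace y = 1 if y >= 2 & y <= 4

    /* compare the original values with the result */

       tabulate x y, missing

Every nonmissing value of y ends up as 1, while the missing value stays missing.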


 


 

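2. We didn't change the missing values in either the replace or recode examples, so they remain missing. None of the if conditions matches a missing value, and recode leaves any value not covered by a rule unchanged.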
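
Stata treats a missing value as larger than any number, so a condition with only a lower bound would sweep missing values in. A small sketch of the difference, assuming factype2 contains some missing values:

       count if factype2 >= 2
       count if factype2 >= 2 & !missing(factype2)

The first count includes observations where factype2 is missing; the second excludes them. The replace commands in this lesson are safe because each condition either tests for equality or sets an upper bound.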


 


 

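3. In older versions of Stata, only one. Stata 12 and later also accept a parenthesized list of old names followed by a parenthesized list of new names. Later we'll see how to rename or make just about any other change to a lot of variables easily using the foreach command.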
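
For illustration, here are two ways to rename several variables at once. The first assumes Stata 12 or later; the second is a preview of the foreach loop. The 99 suffix is a hypothetical naming choice, not something taken from the survey file. Run one or the other, not both.

       rename (pill inject) (pill99 inject99)

    /* or: add the suffix to every variable from pill through natural */

       foreach v of varlist pill-natural {
           rename `v' `v'99
       }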


 


 

4. Yes. For example, the last 10 variables, pill through natural, have a yes/no answer format:

       label def yesno 1 "Yes" 2 "No"
       label val pill yesno
       label val inject yesno
    /* ...and so on for each variable in between... */
       label val natural yesno
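
Because label values accepts a list of variables, the assignments can also be written as one command. This sketch assumes that pill through natural sit next to each other in the data file, as described above, and that the yesno label has already been defined:

       label val pill-natural yesno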


 

 


 

5. To change a variable label, simply type a new label var statement:

       label var factype "Type of facility"

To change a value label, you have two choices:

  • drop the existing label and recreate it with the changes, or
  • create a new label and reassign the variable to the new label.

The describe command shows you which variables have value labels attached to them and the names of the labels. Using that command, we see that the variable urbrur has the label urb. Here's how to drop urb and recreate it with new values:

       label drop urb   
       label def urb 1 "RURAL" 2 "URBAN" 3 "MIXED"
       label val urbrur urb
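
If you only need to add or change a few entries, the add and modify options of label define avoid dropping the label first. A minimal sketch, assuming urb currently labels only the values 1 and 2 (add refuses to overwrite a value that is already labelled, so treat the two define lines as alternatives):

    /* add a label for a value that has none yet */

       label define urb 3 "MIXED", add

    /* or: change existing entries and add new ones in one step */

       label define urb 1 "RURAL" 2 "URBAN" 3 "MIXED", modify

    /* check the result */

       label list urb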


 

 


 

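6. When you want to get rid of most of your variables, it's easier to list the few you want with the keep command than to list the many you don't want with drop. In the same way, keep if retains the observations that meet a condition, where drop if deletes them.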
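
A short sketch of both forms. It assumes the recoded variable type from the main example exists; which variables and observations you actually keep is up to you.

    /* keep type, urbrur, and the block pill through natural; drop all other variables */

       keep type urbrur pill-natural

    /* keep only the observations where urbrur equals 1 */

       keep if urbrur == 1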


 

 


 

7. Both commands display the data in spreadsheet format. The edit command allows you to make changes directly in the data, just as you would in a spreadsheet. The browse command allows you to view the data but not to make changes.

 

Note: the edit command does make entries in the Stata log, but they are not a very useful record of the changes you've made. The best record of data editing identifies each case by a case identifier, but the edit command uses _n, the observation number, which depends on the current sort order of the data in memory. Because edit leaves you with no reliable record of what you've done, we recommend that you never use it.
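
A correction written as a command in a do-file, identifying the case explicitly, gives a much clearer record. In this sketch, facid and the value 1234 are hypothetical stand-ins for whatever unique identifier your data file contains:

    /* correct the facility type for one specific facility */

       replace type = 3 if facid == 1234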

 


 


 
