Looping over variables and values

 

We often need to run the same command for a large number of variables. For example, we might want to change the value 9 to missing for 200 variables in a data file. We can type the recode command 200 times, or we can type the foreach command once and let it create 200 copies of recode for us.

The foreach and forvalues commands are convenient ways to save typing. Beyond the recode situation above, they can rename a whole group of variables in one step instead of many separate rename commands. In fact, any command, or set of commands, that you need to repeat over a group of variables is a likely candidate for one of these labor-saving commands.

Two examples of foreach and forvalues are shown below.

For an engaging discussion of the topic by Nicholas Cox of the University of Durham, UK, see the PDF file How To Face Lists With Fortitude. This article was also published in Stata Journal 2(2), 2002.


clear


use "q:\utilities\statatut\examfac2.dta"


su q102_* q103_*

/* Replace all values of 99 with missing in q102_* and q103_* */

foreach x of varlist q102_* q103_* {
   replace `x'= .  if `x' == 99
}


/* Rename q102_* to title* and rename q103_* to fphour* */

forval x=1/20 {
   rename q102_`x' title`x'
   rename q103_`x' fphour`x'
}
exit


TIP:   To see how each pass through the loop is resolved, do the following.

clear
use "q:\utilities\statatut\examfac2.dta"
set tracedepth 1   // only show one level down
set trace on  // turn on Stata's trace option

foreach x of varlist q102_* q103_* {
   replace `x'= . if `x' == 99
}
set trace off  // return to normal Stata mode

 


Questions:

1. What is the "*" in the foreach command?


2. What purpose does the word "varlist" serve in the foreach command?


3. What is the "x" in each of the commands?


4. The foreach and forval commands have a couple of characters I haven't seen before in this tutorial: {} and `. What are they?


5. Why does the forval command take up more than one line?

 


Answers:

 

1. The asterisk (*) after a variable name is a wildcard shortcut in Stata. You can use it in any Stata command that accepts a variable list, not just this one. Here it tells Stata to look for all variable names that begin with "q102_" or "q103_" and end with anything. We know that they end with the numbers 1-20, so it's the same as typing all 40 variable names:

foreach x of varlist q102_1 q102_2 q102_3 ... q103_19 q103_20 { 
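To check which names a wildcard matches before looping over them, one option is to expand the pattern into a macro first. The sketch below is only an illustration: it assumes the auto dataset that ships with Stata rather than this tutorial's data file, and the macro name "matched" is made up for the example.

```stata
sysuse auto, clear        // example dataset shipped with Stata (an assumption, not the tutorial's file)

* unab expands a variable-list pattern and stores the full names in a local macro
unab matched : m*
display "`matched'"       // shows every variable whose name begins with m

* the same pattern works directly in a command that accepts a variable list
summarize m*
```

If the expanded list is not what you expected, it is better to find out here than after a loop has already changed the data.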


2. The word "varlist" in the foreach command tells Stata that we are referring to a list of existing variables. The foreach command accepts other list types, such as "newlist" for a list of new variable names.
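As a rough illustration of how the list type changes what foreach expects, the sketch below uses three of the other forms. The variable names score1-score3 and the words alpha, beta and gamma are invented for the example and are not part of the tutorial's data.

```stata
clear
set obs 5                         // five empty observations to work with

* of newlist: Stata checks that each name is a valid NEW variable name
foreach v of newlist score1 score2 score3 {
    generate `v' = 0
}

* of numlist: loop over numbers written in Stata's number-list shorthand
foreach n of numlist 1/3 {
    replace score`n' = `n' * 10
}

* in: loop over any list of words, with no checking at all
foreach word in alpha beta gamma {
    display "`word'"
}
```

Note that "newlist" only checks the names; the variables are created by the generate command inside the loop.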


3. The "x" is a local macro (a temporary placeholder, not a variable in the data) that stands for each variable in the foreach variable list, or each value in the forvalues number list. In the first example, the command between the braces is repeated once for each variable in the variable list, substituting the real variable name for `x' in the replace command. In the second example, each number in the list 1/20 is substituted for `x' in turn. Note that the name need not be "x": any valid macro name will do, but "x" is quick and easy to type.

 

The net result of the foreach command is the same as typing 40 separate replace commands:

replace q102_1=. if q102_1==99
replace q102_2=. if q102_2==99
  etc.
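One way to see this substitution without touching any data is to have the loop display each command instead of running it. This is a minimal sketch: the macro is named "v" here just to show that the name is arbitrary, and only three of the variable names are typed out.

```stata
* The names are typed out with "in", so no data file needs to be open
foreach v in q102_1 q102_2 q102_3 {
    display "replace `v' = . if `v' == 99"
}
```

Each pass prints the replace command with the current name filled in for the macro, which is the text Stata would otherwise execute.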


4. The braces {} in the foreach and forval commands surround the commands that you want to execute on each pass through the loop. In the foreach command there is one command (replace), but in the forval command we have inserted two commands between the braces. The other character is a left single quote, also called a backtick or grave accent. On a standard US keyboard it shares a key with the tilde (~), at the upper left next to the 1 key. This character tells Stata that the name following it refers to a special kind of placeholder known as a "local macro". A macro (local or global) is not a variable and does not become part of the data in memory; a local macro exists only within the do-file, program, or interactive session that defined it. A reference to a local macro must be surrounded by a backtick on the left and an apostrophe (also called a "single quote") on the right, like this: `m'
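Local macros can also be defined by hand, outside any loop, which makes the quoting easier to see. The following is a small sketch; the macro names "m" and "stem" and their contents are chosen only for illustration.

```stata
local m 99                          // store the text 99 in a local macro named m
display "m holds `m'"               // Stata sees: display "m holds 99"
display `m' + 1                     // Stata sees: display 99 + 1

local stem q102_                    // hypothetical name stem
display "`stem'1 `stem'2"           // Stata sees: display "q102_1 q102_2"
```

In each case Stata replaces the quoted macro name with its contents before the command runs, which is what happens to the loop macro on every pass.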


5. Stata requires separate lines for each part of these commands. Here is how Stata's help lays out the syntax rules:

  • the open brace must appear on the same line as "foreach" or "forvalues"
  • nothing may follow the open brace except, of course, comments
  • the first command to be executed must appear on a new line
  • the close brace must appear on a line by itself
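A minimal layout that follows these rules might look like the sketch below. The loop body is a placeholder display command rather than anything from the tutorial's data.

```stata
forvalues i = 1/3 {        // open brace on the same line; only a comment may follow it
    display "pass `i'"     // the first command starts on a new line
}
```

The indentation is optional, but it makes the start and end of the loop body easy to spot.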

Alternatively, we can change the command delimiter to semicolon (;) and put the entire command on one line like this:

   #delimit ;
   forval x=1/20 {; rename q102_`x' title`x'; rename q103_`x' fphour`x'; };
   #delimit cr

Frankly, that's pretty hard to read. Using separate lines and indentation looks much nicer. In fact, if you copy the above lines into your do-file editor and run them, you'll see in the Results window that Stata will improve your code by using separate lines and indentation - at least your log will be easy to read!
