# Retained variables

### What is a retained variable?

A retained variable

• is not automatically set to missing before the next iteration of the data step
• keeps its value from the end of the current iteration to the beginning of the next iteration of the data step
• changes value only when the programmer's code assigns it a new one

The RETAIN statement

• names the variables that should not be set to missing before the next iteration of the data step (the "retained variables")
• may give initial values (used for the first iteration of the data step)
• is non-executable (it can be placed anywhere in the data step)
• examples:
```
retain x y i .;               /* initialize x, y, and i to missing */

retain x1-x5 0 name 'alice';  /* initialize x1-x5 to zero and      */
                              /* initialize name to 'alice'        */

retain x1-x5 (0 1 2 3 4);     /* initialize x1 to zero, x2 to one, */
                              /* x3 to two, x4 to three, x5 to four */
```
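
One way to check initial values like these is to run a data step that reads no input and writes the variables to the log. The following is a minimal sketch rather than part of the original examples; it uses `data _null_` so that no data set is created.

```sas
data _null_;
    /* two of the initial-value forms shown above, combined in one statement */
    retain x1-x5 (0 1 2 3 4) name 'alice';

    /* with no INPUT or SET statement, the step executes exactly once */
    put x1= x2= x3= x4= x5= name=;
run;
```

The PUT statement writes each variable in `name=value` form, so the log shows the initial values before any other statement has had a chance to change them.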

Retained variables are especially important when working with grouped observations. First we'll examine the concept with a simple example.
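
For orientation, here is a rough sketch of what the grouped case can look like. It is not taken from the original page: the data set `sales`, its variables `region` and `amount`, and the total `grptot` are all hypothetical names chosen for illustration, and the sketch assumes the data are already sorted by `region`. The mechanics of RETAIN itself are explained with the simpler example in the next section.

```sas
data sales;                          /* hypothetical input data */
    input region $ amount;
    datalines;
east 10
east 20
west 5
west 15
west 30
;
run;

data region_totals;
    set sales;
    by region;                       /* assumes sales is sorted by region */
    retain grptot;                   /* carry the total across observations */
    if first.region then grptot = 0; /* restart the total for each group */
    grptot = grptot + amount;
    if last.region then output;      /* write one observation per group */
    keep region grptot;
run;

proc print data=region_totals noobs;
run;
```

With the sample values above, the result has one observation per region: a total of 30 for `east` and 50 for `west`.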

### Example of a data step with a retained variable

```
data alpha;
    input a b c;
    retain runtot 0;   /* runtot will keep a running total of a, b, c */
    runtot= runtot + (a + b + c);
    datalines;
2 4 6
3 1 5
0 7 9
8 5 4
;
run;

proc print data=alpha noobs;
run;
```

The output from proc print above would look like this:
```
A    B    C    RUNTOT

2    4    6        12
3    1    5        21
0    7    9        37
8    5    4        54
```
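
The same running total can also be written with a SAS sum statement, which has the form `variable + expression;`. The sketch below is an alternative to the example above, not a replacement for it: a sum statement retains its variable automatically and initializes it to zero, so no explicit RETAIN statement is needed. The data set name `alpha2` is chosen here only for illustration.

```sas
data alpha2;
    input a b c;
    runtot + (a + b + c);   /* sum statement: runtot is retained and starts at 0 */
    datalines;
2 4 6
3 1 5
0 7 9
8 5 4
;
run;

proc print data=alpha2 noobs;
run;
```

For this data the values of `runtot` are again 12, 21, 37, and 54. One difference worth knowing is that a sum statement ignores a missing value of the expression, whereas the assignment `runtot= runtot + (a + b + c);` would turn the total into missing.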

### How a retained variable behaves during data step execution

To understand the values of the retained variable RUNTOT on the output observations, picture the Program Data Vector (PDV) during data step execution as follows.

Before `input a b c;` is executed the first time:

```
         A     B     C   RUNTOT
      -------------------------
      |     |     |     |     |
PDV   |     |     |     |  0  |
      |     |     |     |     |
      -------------------------
```
After `input a b c;` is executed the first time:
```
         A     B     C   RUNTOT
      -------------------------
      |     |     |     |     |
PDV   |  2  |  4  |  6  |  0  |
      |     |     |     |     |
      -------------------------
```
After `runtot= runtot + (a + b + c);` is executed the first time:
```
         A     B     C   RUNTOT
      -------------------------
      |     |     |     |     |
PDV   |  2  |  4  |  6  | 12  |      1st obs output:  A  B  C  RUNTOT
      |     |     |     |     |                       2  4  6      12
      -------------------------
```
After `input a b c;` is executed the second time:
```
         A     B     C   RUNTOT
      -------------------------
      |     |     |     |     |
PDV   |  3  |  1  |  5  | 12  |
      |     |     |     |     |
      -------------------------
```
After `runtot= runtot + (a + b + c);` is executed the second time:
```
         A     B     C   RUNTOT
      -------------------------
      |     |     |     |     |
PDV   |  3  |  1  |  5  | 21  |      2nd obs output:  A  B  C  RUNTOT
      |     |     |     |     |                       3  1  5      21
      -------------------------
```
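
The PDV snapshots above can also be observed directly by writing the variables to the log at each stage. The following is a sketch of one way to do that, assuming the same input data; the label text in the PUT statements is arbitrary, and `data _null_` is used so that the trace does not create a data set.

```sas
data _null_;
    retain runtot 0;
    put 'Before INPUT:     ' a= b= c= runtot=;   /* a, b, c are missing here */
    input a b c;
    put 'After INPUT:      ' a= b= c= runtot=;   /* runtot still holds the old total */
    runtot= runtot + (a + b + c);
    put 'After assignment: ' a= b= c= runtot=;   /* runtot now includes this line */
    datalines;
2 4 6
3 1 5
0 7 9
8 5 4
;
run;
```

In the log, the "Before INPUT" line shows `a`, `b`, and `c` as missing at the start of every iteration while `runtot` carries over the previous total. That line appears one extra time at the end, because the step begins a fifth iteration and only stops when INPUT finds no more data lines.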

Question: What would be the result if you omitted the RETAIN statement in the data step above?

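Answer: Without the RETAIN statement, `RUNTOT` would be set to missing at the start of each iteration, so the assignment would compute `. + a + b + c` for the current observation. Because arithmetic on a missing value yields missing, `RUNTOT` would be missing for all observations.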
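
To see this for yourself, the sketch below repeats the earlier data step with the RETAIN statement removed. The data set name `alpha_noretain` is chosen here only for illustration.

```sas
data alpha_noretain;
    input a b c;
    /* no RETAIN: runtot is reset to missing at the start of each iteration */
    runtot= runtot + (a + b + c);
    datalines;
2 4 6
3 1 5
0 7 9
8 5 4
;
run;

proc print data=alpha_noretain noobs;
run;
```

The printed data set shows a missing value (`.`) for `runtot` on every observation.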
