
UNC Carolina Population Center

 

SAS arrays: What are they and why use them?


What is a SAS array?

A SAS array is
  • a set of variables grouped together for the duration of a data step by being given a name in an ARRAY statement
             array pop(1:5) ga sc nc va wv;
    • array name is "pop"
    • the sequence of variable names ga sc nc va wv is the "array list"
    • the variables in the array list are the "array elements"
    • each array element in this example has a position number in the array list from 1 to 5

Convenience of arrays

  • two ways to refer to variables which are array elements
    • variable name
    • array name with subscript pointing to position of variable in the array list
        pop(3) refers to the variable in position 3 of the array pop (nc in the example above)

Subscripts can be

  • constants
  • variables
  • expressions
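
The three kinds of subscript can be sketched in a short data step. The population values and the names k, first, second and third are made up for illustration; only the array declaration comes from the example above.

```sas
data _null_;
   array pop(1:5) ga sc nc va wv;

   /* hypothetical values, for illustration only */
   ga = 10; sc = 5; nc = 11; va = 9; wv = 2;

   k = 2;
   first  = pop(1);      /* constant subscript:   ga */
   second = pop(k);      /* variable subscript:   sc */
   third  = pop(k + 1);  /* expression subscript: nc */

   put first= second= third=;   /* writes first=10 second=5 third=11 to the log */
run;
```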

Array elements

  • must all be of same type (numeric or character)
  • can be variables which exist or which will be created in the data step


One-dimensional array declarations

array x(1:6) a b c d e f;             x(5) same as e

array x(0:5) a b c d e f;             x(4) same as e

array quiz(20) q1-q20;                equivalent array declarations:
array quiz(1:20) q1-q20;              quiz(4) same as q4;
array quiz(*) q1-q20;                 subscript lower bound=1, upper bound=20

array quiz(20);                       SAS creates quiz1-quiz20 as array elements

array color(1:3) $ 1 red blue green;  character array, elements have length=1 character;
                                      color(2) same as blue

array pop(1:5) yr95-yr99;             pop(2) same as yr96

array pop(95:99) yr95-yr99;           pop(96) same as yr96

array x(*) _numeric_;                 all numeric variables on the observation
array y(*) _character_;               all character variables on the observation
array z(*) _all_;                     all variables on the observation (valid only if they are all the same type)
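
When the lower bound is not 1, or the size is set with (*), the DIM, LBOUND and HBOUND functions give the loop limits without hard-coding them. The following is a minimal sketch using the pop(95:99) declaration above; the index variable yr and the assigned values are made up for illustration.

```sas
data _null_;
   array pop(95:99) yr95-yr99;

   /* loop over the declared subscript range, whatever it is */
   do yr = lbound(pop) to hbound(pop);
      pop(yr) = yr * 100;          /* made-up values */
   end;

   n = dim(pop);                   /* number of elements: 5 */
   put n= yr96= yr99=;             /* writes n=5 yr96=9600 yr99=9900 to the log */
run;
```

A loop written as do yr=1 to dim(pop) would not work here, because subscripts 1 through 5 are outside the declared range 95:99.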


Two-dimensional array declarations

array quiz(1:4,1:5) q1-q20;        picture 4 rows, 5 columns: 
q1 q2 q3 q4 q5
q6 q7 q8 q9 q10
q11 q12 q13 q14 q15
q16 q17 q18 q19 q20

quiz(2,4) same as q9

array pop(1:3,98:99) nc98 nc99 va98 va99 sc98 sc99;        picture 3 rows, 2 columns:
nc98 nc99
va98 va99
sc98 sc99

pop(3,99) same as sc99
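
The row-by-row layout can be checked with nested DO loops. This is a sketch only: the variables n, row, col and check are hypothetical names, and the loop simply numbers the elements in order so that each one holds its own position.

```sas
data _null_;
   array quiz(1:4,1:5) q1-q20;

   /* number the elements row by row */
   n = 0;
   do row = 1 to dim(quiz, 1);       /* first dimension: 4 rows */
      do col = 1 to dim(quiz, 2);    /* second dimension: 5 columns */
         n = n + 1;
         quiz(row, col) = n;
      end;
   end;

   check = quiz(2, 4);
   put check= q9=;                   /* writes check=9 q9=9 to the log */
run;
```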


Why use SAS arrays?

  • repeat an action or set of actions on each of a group of variables
  • write shorter programs
  • restructure a SAS data set to change the unit of observation


Simple examples using one-dimensional arrays

1. Recode the set of variables A B C D E F G in the same way: if the variable has a value of 99 recode it to SAS missing.
     array v(7) a b c d e f g;
     do k=1 to 7;
        if v(k)=99 then v(k)=.;
     end;


2. Each observation of your data set has five variables SEX1 SEX2 SEX3 SEX4 SEX5 which give the sex (1=male, 2=female) of up to 5 persons. You want to count the number of males (MALES) and the number of females (FEMALES) on each observation.

     array sex(1:5) sex1-sex5;
     males=0;
     females=0;
     do i=1 to 5;
        if sex(i)=1 then males=males+1;
        else if sex(i)=2 then females=females+1;
     end;
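
The statements in these examples are fragments that belong inside a data step. As a sketch of how example 2 might look as a complete program, the step below reads two made-up observations; the data set name counts and the data lines are assumptions for illustration.

```sas
data counts;
   input sex1-sex5;
   array sex(1:5) sex1-sex5;

   males = 0;
   females = 0;
   do i = 1 to 5;
      if sex(i) = 1 then males = males + 1;
      else if sex(i) = 2 then females = females + 1;
   end;
   drop i;                 /* keep the loop index out of the output data set */
   datalines;
1 2 2 . .
2 2 1 1 1
;
run;

proc print data=counts;
run;
```

With these data lines the first observation has males=1 and females=2, and the second has males=3 and females=2. Missing values of sex are counted in neither total.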


3. Recode all numeric variables in your data set as follows: if a variable has a value of 98 or 99 recode it to SAS missing.

     array nvar(*) _numeric_;
     do i=1 to dim(nvar);
        if nvar(i)=98 or nvar(i)=99 then nvar(i)=.;
     end;


Using arrays to restructure a SAS data set

For program examples see how to create a child file from a mother file or how to create a mother file from a child file.
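
As a rough idea of what such a restructuring looks like, the sketch below turns a file with one observation per mother into a file with one observation per child. It is not taken from the linked examples: the data set names mothers and children, the variables momid, age1-age3, childnum and childage, and the data values are all hypothetical.

```sas
/* one observation per mother, with ages of up to 3 children (made-up data) */
data mothers;
   input momid age1-age3;
   datalines;
101 7 4 .
102 12 . .
;
run;

/* one observation per child */
data children;
   set mothers;
   array ages(1:3) age1-age3;
   do childnum = 1 to 3;
      if ages(childnum) ne . then do;
         childage = ages(childnum);
         output;                      /* write one observation for this child */
      end;
   end;
   keep momid childnum childage;
run;
```

Here the two mother observations become three child observations. The explicit OUTPUT statement inside the loop is what changes the unit of observation.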


Questions or comments?  If you are affiliated with the Carolina Population Center, send them to Phil Bardsley.