Creating a household-level variable from person-level data
A person-level data set derived from U.S. Census data
- H_SEQ is the household sequence number
- PERSON is the person number within the household
- each observation is a person
- the original data set consists of 132,324 persons from the 50,785 households interviewed in the Current Population Survey of March 1999.
H_SEQ  PERSON  AGE  MARITL  SEX  HGA  RACE  PERRP
    1       1   31       1    1   39     1      1
    1       2   36       1    2   44     1      3
    1       3    7       7    1    0     1      4
    1       4    2       7    2    0     1      4
    2       1   31       7    2   39     1      1
    2       2   49       5    2   39     1      6
    2       3   10       7    1    0     1      4
    3       1   50       1    1   43     1      1
    3       2   52       1    2   44     1      3
    3       3    8       7    2    0     1      4
    3       4   10       7    1    0     1      4
    3       5   46       6    2   35     1     12
    4       1   71       1    1   45     1      1
    4       2   68       1    2   43     1      3
    5       1   67       4    2   31     1      2
...
The DATA step below creates a household-level variable, MEMLT18, from the person-level data shown above.
The value of MEMLT18 is the number of household members under 18 years of age.
The step writes a temporary household-level SAS data set named hh with two variables, H_SEQ and MEMLT18.
The input is a permanent person-level SAS data set named percps99, which must be sorted by H_SEQ because the step uses a BY statement.
libname in "C:\data\class01";

data hh(keep= h_seq memlt18);
    set in.percps99(keep= h_seq age);
    by h_seq;
    retain memlt18 0;                          * initialize memlt18 to zero *;
    if first.h_seq then memlt18= 0;            * reset memlt18 for each by-group *;
    if age < 18 then memlt18= memlt18+1;       * add up memlt18 *;
    if last.h_seq then output;
    label memlt18= "hh members < 18 yrs";
run;
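To see the step at work without the permanent data set, the sketch below reads the fifteen rows listed above from in-stream data and applies the same logic. The data set names percps_sample and hh_sample are hypothetical, chosen only for this illustration. Household 5 is cut off in the listing, so its count covers only the one row shown.

```sas
/* Person-level sample: the rows shown in the listing above.        */
/* percps_sample is a hypothetical name used only for this sketch.  */
data percps_sample;
    input h_seq person age maritl sex hga race perrp;
    datalines;
1 1 31 1 1 39 1 1
1 2 36 1 2 44 1 3
1 3 7 7 1 0 1 4
1 4 2 7 2 0 1 4
2 1 31 7 2 39 1 1
2 2 49 5 2 39 1 6
2 3 10 7 1 0 1 4
3 1 50 1 1 43 1 1
3 2 52 1 2 44 1 3
3 3 8 7 2 0 1 4
3 4 10 7 1 0 1 4
3 5 46 6 2 35 1 12
4 1 71 1 1 45 1 1
4 2 68 1 2 43 1 3
5 1 67 4 2 31 1 2
;
run;

/* BY-group processing expects the input in h_seq order.            */
/* The sample is already ordered; the sort is a safeguard.          */
proc sort data=percps_sample;
    by h_seq;
run;

/* Same logic as the step above, applied to the sample.             */
data hh_sample(keep=h_seq memlt18);
    set percps_sample(keep=h_seq age);
    by h_seq;
    retain memlt18 0;                        /* keep value across rows     */
    if first.h_seq then memlt18 = 0;         /* reset at each household    */
    if age < 18 then memlt18 = memlt18 + 1;  /* count members under 18     */
    if last.h_seq then output;               /* one row per household      */
    label memlt18 = "hh members < 18 yrs";
run;

proc print data=hh_sample noobs;
run;
```

For these rows, hh_sample holds one observation per household, with MEMLT18 equal to 2, 1, 2, 0, and 0 for households 1 through 5. One caution when adapting the step: SAS treats a missing numeric value as smaller than any number, so a person with a missing AGE would satisfy age < 18 and be counted. If the input can contain missing ages, a condition such as `if . < age < 18` would exclude them.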


