Creating a household-level variable from person-level data

 

A person-level data set derived from U.S. Census data

 

  • H_SEQ is the household sequence number
  • PERSON is the person number within the household
  • each observation is a person
  • the original data set consists of 132,324 persons from the 50,785 households interviewed in the Current Population Survey of March 1999.
H_SEQ     PERSON   AGE    MARITL  SEX      HGA       RACE     PERRP     

 1         1       31     1       1        39        1         1
 1         2       36     1       2        44        1         3
 1         3        7     7       1         0        1         4
 1         4        2     7       2         0        1         4

 2         1       31     7       2        39        1         1
 2         2       49     5       2        39        1         6
 2         3       10     7       1         0        1         4

 3         1       50     1       1        43        1         1
 3         2       52     1       2        44        1         3
 3         3        8     7       2         0        1         4
 3         4       10     7       1         0        1         4
 3         5       46     6       2        35        1        12

 4         1       71     1       1        45        1         1
 4         2       68     1       2        43        1         3

 5         1       67     4       2        31        1         2

      .
      .
      .
      .


The DATA step below creates a household-level variable, MEMLT18, from the person-level data shown above. MEMLT18 is the number of household members under 18 years of age. The step reads the permanent SAS data set percps99, which has one observation per person, and writes a temporary SAS data set named hh with one observation per household and two variables: H_SEQ and MEMLT18. Because the step uses a BY statement, percps99 must already be sorted (or indexed) by H_SEQ; the BY statement is what makes the automatic variables FIRST.H_SEQ and LAST.H_SEQ available.

  libname in "C:\data\class01";

  data hh(keep= h_seq memlt18);
   set in.percps99(keep= h_seq age);
   by h_seq;

   retain memlt18 0; * initialize memlt18 to zero and keep its value across observations *;
   if first.h_seq then memlt18= 0; * reset the count at the first person of each household *;
   if age < 18 then memlt18= memlt18+1; * count this person if under 18 *;
   if last.h_seq then output;

   label memlt18= "hh members < 18 yrs";

  run;
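
  To check the logic by hand, the same step can be run against a small test data set. The sketch below is illustrative only: the data set name persons is an assumption, it types in just H_SEQ, PERSON and AGE for the first five households listed above, and it reads from that temporary data set instead of in.percps99.

```sas
* hypothetical test data: the first five households from the listing above *;
data persons;
  input h_seq person age;
  datalines;
1 1 31
1 2 36
1 3 7
1 4 2
2 1 31
2 2 49
2 3 10
3 1 50
3 2 52
3 3 8
3 4 10
3 5 46
4 1 71
4 2 68
5 1 67
;
run;

* same logic as the step above, reading the test data instead of percps99 *;
data hh(keep= h_seq memlt18);
  set persons(keep= h_seq age);
  by h_seq;

  retain memlt18 0;
  if first.h_seq then memlt18= 0;       * start each household at zero *;
  if age < 18 then memlt18= memlt18+1;  * count members under 18 *;
  if last.h_seq then output;            * one observation per household *;

  label memlt18= "hh members < 18 yrs";
run;

* expected memlt18 by h_seq: 1 -> 2, 2 -> 1, 3 -> 2, 4 -> 0, 5 -> 0 *;
proc print data=hh noobs;
run;
```

  One point to watch: SAS treats a missing numeric value as smaller than any number, so age < 18 is true when AGE is missing. Whether percps99 contains missing ages is not stated here; if it might, a condition such as 0 <= age < 18 would keep those persons out of the count.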

 

 


