Restructuring a SAS data set: from a child file to a mother file

Data set CHILD (1 obs= child):

A child is identified by the pair of variables MID CID, where MID identifies the mother.

             MID      CID    AGE    SEX

              1        1      2      f 
              1        2      5      m 
              2        1     10      m 
              2        2     12      m 
              2        3     13      f 
              3        1      7      f 
                  .
                  .
                  .



Data step to create data set MOTHER (1 obs= mother):

 

First solution

data mother(keep= mid cid1-cid3 age1-age3 sex1-sex3);
     set child;
     by mid;

     length sex1-sex3 $ 1;

     retain cid1-cid3 age1-age3 sex1-sex3 i; /* retain variables to be output */ 
                                             /* and retain pointer variable i */

     array c(1:3) cid1-cid3;
     array a(1:3) age1-age3;
     array s(1:3) sex1-sex3;

     if first.mid then do; /* reset retained variables at start of MID BY group */
                          do j= 1 to 3;
                            c(j)= .;
                            a(j)= .;
                            s(j)= "  ";
                          end;
                          i= 1;
                       end;

     /* Assign values to output variables from variables in current input observation: */

     c(i)= cid;
     a(i)= age;
     s(i)= sex;

     i= i + 1; /* increment i to point to next set of child variables */

     if last.mid then output; /* output observation if this is last child of mother */
run;


The output data set MOTHER (1 obs= mother) looks like this:

            MID    CID1    AGE1    SEX1    CID2    AGE2    SEX2    CID3    AGE3    SEX3

             1      1       2        f       2       5       m       .       .         
             2      1      10        m       2      12       m       3      13       f  
             3      1       7        f       .       .               .       .         
                .
                . 
                .

 

Second solution

Here is a second solution to the above restructuring problem. To produce the mother file, the child file is merged with three copies of itself and with appropriate use of the RENAME= and WHERE= data set options.

 

       

data mother(keep= mid cid1-cid3 age1-age3 sex1-sex3);
     merge
          child(rename=(cid=cid1 age=age1 sex=sex1) where= (cid1 = 1))
          child(rename=(cid=cid2 age=age2 sex=sex2) where= (cid2 = 2))
          child(rename=(cid=cid3 age=age3 sex=sex3) where= (cid3 = 3))
          ;
     by mid;
run;

 

 


Another topic?

Wink Plone Theme by Quintagroup © 2013.

Personal tools
This is themeComment for Wink theme