Restructuring a SAS data set: from a child file to a mother file
Data set CHILD (1 obs= child):
A child is identified by the pair of variables MID CID, where MID identifies the mother.
MID CID AGE SEX
1 1 2 f
1 2 5 m
2 1 10 m
2 2 12 m
2 3 13 f
3 1 7 f
.
.
.
Data step to create data set MOTHER (1 obs= mother):
First solution
data mother(keep= mid cid1-cid3 age1-age3 sex1-sex3);
set child;
by mid;
length sex1-sex3 $ 1;
retain cid1-cid3 age1-age3 sex1-sex3 i; /* retain variables to be output */
/* and retain pointer variable i */
array c(1:3) cid1-cid3;
array a(1:3) age1-age3;
array s(1:3) sex1-sex3;
if first.mid then do; /* reset retained variables at start of MID BY group */
do j= 1 to 3;
c(j)= .;
a(j)= .;
s(j)= " ";
end;
i= 1;
end;
/* Assign values to output variables from variables in current input observation: */
c(i)= cid;
a(i)= age;
s(i)= sex;
i= i + 1; /* increment i to point to next set of child variables */
if last.mid then output; /* output observation if this is last child of mother */
run;
The output data set MOTHER (1 obs= mother) looks like this:
MID CID1 AGE1 SEX1 CID2 AGE2 SEX2 CID3 AGE3 SEX3
1 1 2 f 2 5 m . .
2 1 10 m 2 12 m 3 13 f
3 1 7 f . . . .
.
.
.
Second solution
Here is a second solution to the above restructuring problem. To produce the mother file, the child file is merged with three copies of itself and with appropriate use of the RENAME= and WHERE= data set options.
data mother(keep= mid cid1-cid3 age1-age3 sex1-sex3);
merge
child(rename=(cid=cid1 age=age1 sex=sex1) where= (cid1 = 1))
child(rename=(cid=cid2 age=age2 sex=sex2) where= (cid2 = 2))
child(rename=(cid=cid3 age=age3 sex=sex3) where= (cid3 = 3))
;
by mid;
run;
Another topic?


