Questions about Data

How do I read a SAS export file?

The following SAS commands will allow you to read a SAS export file:
   libname in xport '/directory path where file is located/SAS export file name';
   data wave1;
       set in.SAS dataset name;
   run;

For example, if the Add Health export file on the CD is named ALLWAVE1.EXP, the internal SAS dataset name is ALLWAVE1, and your CD drive is D:, the commands are:

   libname in xport 'd:\allwave1.exp';
   data wave1;
       set in.allwave1;
   run;

How do I read a SAS export file with STATA?

In STATA 8.1, the following command reads a SAS export file:

fdause datasetname.xpt
If the SAS file is named datasetname.exp, rename the file to datasetname.xpt before running the STATA command.

How do I read a SAS export file with SPSS?

The following SPSS command allows you to read a SAS export file.
GET SAS DATA="\folder\datasetname.xpt".
If the SAS file is named datasetname.exp, rename the file to datasetname.xpt before running the SPSS command.

How were adolescents identified as eligible for special oversamples for the in-home interview?

An adolescent's answers to one or more specific questions on the In-School Questionnaire determined his or her eligibility for an oversample. For example, an adolescent who marked "Chinese" as his or her Asian or Pacific Islander background was eligible for the Chinese oversample. The genetic oversamples were identified in two ways. First, all adolescents who indicated they were twins were sampled with certainty. Second, an adolescent who indicated at least one other household member in grades 7 through 12 with whom he or she did not share a biological mother and/or biological father was added to the pool of potential half-siblings and other, non-related adolescents. Full siblings were not oversampled.

How many of the interviewed Wave III public-use sample were originally selected for the core and high education black samples?

Wave III public-use sample = 4,882
Core sample only = 4,490
High education black sample only = 325
Both samples = 67

How do I code gender changes between Wave I and Wave II?

When there is a discrepancy between the Wave I and Wave II gender of a respondent, use the Wave II gender. The restricted-use data include 23 cases in which the Wave I variable BIO_SEX and the Wave II variable BIO_SEX2 do not match. The Wave II data have been confirmed as correct. Wave II includes 7 cases in which the variable SEXFLG2 equals 1. This indicates that the incorrect gender was used to control the questionnaire skips during the interview. The variable BIO_SEX2 was corrected, but answers to questions based on gender will be incorrect.
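
The rule above can be sketched in Python as follows. This is only an illustration, assuming each respondent is held as a dictionary keyed by the variable names BIO_SEX, BIO_SEX2, and SEXFLG2 from the text. The function name resolve_gender, the output keys, the fallback to Wave I when Wave II is absent, and the sample values (including the numeric gender codes) are hypothetical.

```python
def resolve_gender(record):
    """Return the gender to analyze plus two review flags for one respondent.

    Follows the FAQ rule: when Wave I (BIO_SEX) and Wave II (BIO_SEX2)
    disagree, prefer the Wave II value. Falling back to Wave I when no
    Wave II value exists is an assumption of this sketch.
    """
    wave1 = record.get("BIO_SEX")
    wave2 = record.get("BIO_SEX2")
    gender = wave2 if wave2 is not None else wave1
    return {
        "gender": gender,
        # True when both waves report a gender and the values differ.
        "mismatch": wave1 is not None and wave2 is not None and wave1 != wave2,
        # SEXFLG2 == 1: the wrong gender drove the Wave II skip patterns,
        # so gender-dependent answers for this case should not be trusted.
        "skip_error": record.get("SEXFLG2") == 1,
    }


# Hypothetical records; the codes 1 and 2 are placeholders only.
consistent = {"BIO_SEX": 1, "BIO_SEX2": 1, "SEXFLG2": 0}
discrepant = {"BIO_SEX": 1, "BIO_SEX2": 2, "SEXFLG2": 1}

print(resolve_gender(consistent))
print(resolve_gender(discrepant))
```

Keeping the two flags separate lets an analysis use the corrected gender while still excluding gender-dependent items for the cases flagged by SEXFLG2.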

When there are gender discrepancies, how do I know which one is correct?

There are 20 cases in which Wave III gender (BIO_SEX3) does not match the Wave I gender (BIO_SEX). At Wave III, the preloaded gender variable came from the last wave of available data. Eighteen of these inconsistent cases match the Wave II gender (BIO_SEX2) and were confirmed at Wave III as being correct. Of the remaining two inconsistent cases:

In one case the Wave III gender, female, was confirmed by the Add Health security manager as being correct.

In the last case, both the Wave I and Wave II gender are listed as male, which is correct. For this case only, the Wave III gender is incorrect.


What is the best way to compute age in the Add Health Wave I in-home data?

The best way to compute age in the Add Health Wave I data is to use the following variables and formula:

IMONTH - Month interview completed
IDAY - Day interview completed
IYEAR - Year interview completed
H1GI1M - What is your birth date? month [and year]
H1GI1Y - What is your birth date? [month and] year

The respondent's age is constructed using the interview completion date and date of birth variables. Because only the month and year of birth are available, 15 is used as the day of birth when calculating age. Please consult the Introduction to the Adolescent In-Home Codebook to be sure to take into account the respondents whose birth date and/or interview date is incorrect.

The SAS programming code used to construct the AGE variable is provided below.

idate=mdy(imonth,iday,iyear);
bdate=mdy(h1gi1m,15,h1gi1y);
age=int((idate-bdate) / 365.25);

This information is provided by the Add Health team as a service to the Add Health research community. It is provided "as is" with no guarantees as to suitability for a particular purpose.
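
For readers working outside SAS, the same calculation can be sketched in Python. This is an illustrative translation, not code from the Add Health team. The function name compute_age and its argument names are hypothetical, and the sketch assumes four-digit years and valid dates, so missing or invalid codes must be handled before calling it.

```python
from datetime import date


def compute_age(imonth, iday, iyear, birth_month, birth_year):
    """Age in completed years, mirroring the SAS formula above.

    The day of birth is not available, so the 15th is used.
    Assumes four-digit years and valid month/day values.
    """
    idate = date(iyear, imonth, iday)          # interview completion date
    bdate = date(birth_year, birth_month, 15)  # assumed date of birth
    # int() truncates toward zero, as the SAS INT function does.
    return int((idate - bdate).days / 365.25)


# Hypothetical respondent: born March 1979, interviewed 20 June 1995.
print(compute_age(6, 20, 1995, 3, 1979))  # 16
```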


When I calculate a Wave III respondent age using the birth date (month, 15, year) and date of interview, I do not get the same age for some respondents as the one found in variable CALCAGE3. Why does this happen?

The age calculated during the interview uses the actual day of birth, which is not released with the Add Health data. During the Wave III interview, the age of the respondent was calculated by the computer interviewing program and then verified by the respondent. The discrepancy in the ages occurs when a respondent is interviewed during his or her birth month.
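
A small Python sketch can illustrate the effect. The dates are invented, since the actual day of birth is not released, and the two function names are hypothetical. The sketch also assumes that the interview program counted completed years as of the true birthday.

```python
from datetime import date


def exact_age(birth, interview):
    """Completed years of age on the interview date, using the true birthday."""
    had_birthday = (interview.month, interview.day) >= (birth.month, birth.day)
    return interview.year - birth.year - (0 if had_birthday else 1)


def approx_age(birth, interview):
    """Age from the public-use formula: day of birth replaced by the 15th."""
    assumed_birth = date(birth.year, birth.month, 15)
    return int((interview - assumed_birth).days / 365.25)


birth = date(1980, 8, 25)       # hypothetical true birth date
same_month = date(2001, 8, 20)  # interview during the birth month
other_month = date(2001, 10, 20)

print(exact_age(birth, same_month), approx_age(birth, same_month))    # 20 21
print(exact_age(birth, other_month), approx_age(birth, other_month))  # 21 21
```

In this example the two ages differ only for the interview that falls in the birth month, between the assumed birthday (the 15th) and the true one.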

How do I correct for the design effects of the Add Health sampling process?

The paper "Strategies to Perform a Design-Based Analysis Using the Add Health Data" discusses how to correct for design effects and the unequal probability of selection to ensure that your analysis results are nationally representative with unbiased estimates.

The paper "Strategies to Perform a Design-Based Analysis Using the Add Health Data" refers to variables from the Add Health restricted-use data. What variables from the public-use dataset should be used to correct for design effects?

Variables for Correcting for Design Effects in the Public-Use Dataset
Design Type = With Replacement
Unit = Adolescent

                          Wave I          Wave II
N                         6504            4834*
Strata Variable           --------- #     --------- #
Cluster Variable          MEX50197        MEX50197
Weight Variable           GSWGT1          GSWGT2
Number With Weights       6504            4834
Number Missing Weights    0               0
Mean of Weights           3422.6630       3892.7001
Sum of Weights            22261000.000    18817312.465
Minimum Weight Value      256.0588        282.4469
Maximum Weight Value      1835.4864       21107.1003

* These numbers are based on individual datasets, not combined datasets.
# A strata variable is not available; not using a strata variable only minimally affects the standard errors.
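
To show how a cluster variable and a weight variable enter an estimate, here is a minimal Python sketch of a weighted mean with a Taylor-series (linearized) standard error for a single-stratum, with-replacement design. It is not taken from the paper cited above, and the function name and toy data are hypothetical. Real analyses should use survey software that supports this design.

```python
import math
from collections import defaultdict


def weighted_mean_se(rows):
    """Weighted mean and linearized SE, treating clusters as sampled with replacement.

    rows: iterable of (cluster_id, weight, value) tuples, one per respondent.
    No strata are used, matching the public-use table above.
    """
    rows = list(rows)
    total_weight = sum(w for _, w, _ in rows)
    mean = sum(w * y for _, w, y in rows) / total_weight

    # Sum the weighted, linearized residuals within each cluster.
    cluster_totals = defaultdict(float)
    for cluster, w, y in rows:
        cluster_totals[cluster] += w * (y - mean) / total_weight

    n = len(cluster_totals)
    if n < 2:
        raise ValueError("at least two clusters are needed to estimate a standard error")
    # The cluster totals sum to zero, so their mean drops out of the variance.
    variance = n / (n - 1) * sum(z * z for z in cluster_totals.values())
    return mean, math.sqrt(variance)


# Hypothetical toy data: (cluster, weight, outcome).
sample = [("A", 1.0, 1.0), ("A", 1.0, 3.0), ("B", 2.0, 5.0)]
mean, se = weighted_mean_se(sample)
print(mean, se)  # 3.5 1.5
```

With the public-use files, the cluster and weight columns would be the variables listed in the table for the relevant wave.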


Where can I find the monograph about biomarkers collected in Wave III?

"Biomarkers in Wave III of the Add Health Study" is available here in pdf format. The monograph outlines relevant procedures, design, and sampling schemes used in the collection of biomarker data, and serves as a user guide for analysis and interpretation.

How many Wave I in-home respondents have in-school questionnaire data?

15,356 of the Wave I in-home respondents also have in-school data.

What does the Wave I variable COMMID represent?

The COMMID variable groups together the respondents who attend the high school and feeder school that together make up the grade 7 through 12 span for the strata.

© Copyright 2003, Add Health
Page Last Modified: 09/03/2007