Frequently Asked Questions

Questions about Add Health

Questions about Field Work

Questions about Data

General

Sampling and Design Effects

Wave I

Wave II

Wave III

Wave IV

Wave V

Questions about Contracts

General

Adding Researchers or Staff to a Contract

Data Access, Storage, and Security

Wave III Romantic Pairs Data Contract

Questions about Add Health

What is the correct name of the study? 

The correct name of the study is: The National Longitudinal Study of Adolescent to Adult Health. It should be abbreviated as: Add Health. Please see our study logo (in our website banner) for easy reference.

How do I subscribe to the Add Health list server?

Information about Add Health will be distributed through the Add Health list server. To subscribe to the list:

  1.  Send an email to with no message body
  2. Wait for addhealth confirmation email
  3. Click the confirm link in the email

How do I subscribe to the interactive user list server?

If you’d like to communicate with other Add Health users, feel free to join our interactive list server, which Add Health supports so that data users can exchange information. You must be a member to post to the interactive list. To subscribe, send an email to listserv@unc.edu with the following in the body of the message: subscribe addhealth2 <firstname lastname>

How do I briefly describe the Add Health sample design?

A sample of 80 high schools and 52 middle schools from the US was selected with unequal probability of selection. Incorporating systematic sampling methods and implicit stratification into the Add Health study design ensured this sample is representative of US schools with respect to region of country, urbanicity, school size, school type, and ethnicity.

How do I cite the Add Health research design information found on the Web site?

To reference the research design of Add Health data, please use the following citation:

Harris, K.M., C.T. Halpern, E. Whitsel, J. Hussey, J. Tabor, P. Entzel, and J.R. Udry. 2009. The National Longitudinal Study of Adolescent to Adult Health: Research Design [WWW document]. URL: http://www.cpc.unc.edu/projects/addhealth/design.

What acknowledgment should be included in each written report or other publication based on analysis of data from Add Health?

The Add Health contract and data use agreement require that the following be included:

This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Information on how to obtain the Add Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.

Note: Use of this acknowledgment requires no further permission from the persons named.

 What acknowledgment should be included when using questions from the Add Health survey in my study?

These questions are from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Information on how to obtain the Add Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this project.

Regarding the new NIH Public Access Policy, should we include the Add Health grant when our papers are entered into the NIHMS system?

If you have not received direct support from the Add Health Program Project, please use the following acknowledgment statement to satisfy the requirements of the new NIH Public Access Policy:

This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Information on how to obtain the Add Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.

Regarding the new NIH Public Access Policy, how do I make sure my journal articles are assigned PubMed Central reference numbers (PMCIDs)?

Journal articles published using Add Health data must be submitted to PubMed Central to receive a PMCID.  The method of PubMed Central submission and Investigator responsibility for submission depend on the journal and journal publisher.

  1. Some journals automatically submit published articles to PubMed Central.  For a list of journals that submit articles to PubMed Central please visit the NIH website: http://publicaccess.nih.gov/submit_process_journals.htm.
  2. Some journal publishers may submit the articles to PubMed Central automatically or upon request by the author.  For a list of journal publishers that submit articles to PubMed Central please visit the NIH website: http://publicaccess.nih.gov/select_deposit_publishers.htm#b.
  3. If neither the journal nor the journal publisher will submit the article to PubMed Central, the Investigator will be responsible to submit the final peer-reviewed manuscript to PubMed Central via the NIH Manuscript Submission System (NIHMS).  For detailed instructions on the process of submitting a journal article to PubMed Central, please see the NIH website: http://publicaccess.nih.gov/submit_process.htm.
  4. If you have any problems with this process, please contact the NIHMS or PubMed help desk.

Why do my journal articles need PubMed Central reference numbers (PMCIDs)?

NIH policy requires that "[a]nyone submitting an application, proposal or report to the NIH must include the PMC reference number (PMCID) when citing applicable papers that they author or that arise from their NIH-funded research."

How do I cite the Add Health contractual data?

The recommended citation for the Add Health contractual data set is:

Harris, Kathleen Mullan. 2009. The National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves I & II, 1994–1996; Wave III, 2001–2002; Wave IV, 2007–2009 [machine-readable data file and documentation]. Chapel Hill, NC: Carolina Population Center, University of North Carolina at Chapel Hill.

How do I obtain a copy of the monograph "Reducing the Risk: Connections That Make a Difference in the Lives of Youth"?

You may view the monograph online or obtain a printed copy by sending an email or letter to:

Reducing the Risk
Adolescent Health Program
University of Minnesota
Box 721, UMHC
420 Delaware Street SE
Minneapolis, MN 55455

Can researchers provide copies of the Add Health data to journal editors who request it?

Add Health adheres to the NIH policy on data sharing. However, because of the sensitive nature of Add Health data, access is limited and governed by the Add Health data management security plan, so authors are unable to provide Add Health data to journal editors. Authors may instead provide the program code used to construct variables and analyze the data. Editors may obtain a copy of the data under the terms and conditions described on the Add Health website at: http://www.cpc.unc.edu/projects/addhealth/data.

Do the Add Health data contain any identifiers?

The Add Health data files do not contain respondent identifiers or any links to identifiers.  The data do contain ID numbers which are necessary to allow researchers to link data across the waves and to friends and partners (also de-identified).

Was informed consent required for participation in Add Health?

Add Health participants provided written informed consent for participation in all aspects of Add Health in accordance with the University of North Carolina School of Public Health Institutional Review Board guidelines that are based on the Code of Federal Regulations on the Protection of Human Subjects 45CFR46: http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html.

Examples of these forms are available in the Wave III Documentation and Wave IV Documentation files.

 

Questions about Field Work

Were foster youth included in the Add Health study? If so, how many were surveyed?

In the Add Health Wave I in-home interview, 61 respondents out of 20,745 were identified as living with a foster mother and/or foster father.

How were adolescents identified as eligible for special oversamples for the in-home interview?

An adolescent's answer to a specific question or questions on the In-School Questionnaire determined his or her eligibility for inclusion in an oversample. For example, an adolescent who marked "Chinese" as his or her Asian or Pacific Islander background was eligible for the Chinese oversample. The genetic oversamples were identified in two ways. All adolescents who indicated they were twins were sampled with certainty. When an adolescent indicated at least one other household member in grades 7 through 12 with whom he or she did not share a biological mother and/or biological father, they were added to the pool of potential half-siblings and other, non-related adolescents. Full siblings were not oversampled.

How did the interviewer select the parent to interview for the Parent Questionnaire?

The instructions to the interviewer for selecting the parent to interview for the Parent Questionnaire are detailed below.

Parent Interview

The mother (or other female head of the household) of the originally sampled adolescent will be asked to participate in a 40-minute, interviewer-administered, paper-and-pencil survey regarding health status and behaviors of the adolescent, home environment, and her interpersonal relationships. The parent survey instrument does not contain highly sensitive items about the parent; however, it does ask some sensitive questions about the adolescent. The adolescent's mother (or other female head of the household) is the preferred respondent to complete the questionnaire because, according to the results of previous studies, mothers are generally more familiar than fathers with the schooling, health status, and health behaviors of their children.

Identifying the Parent Survey Respondent

Upon your arrival at the household, ask to speak to the student's mother, the preferred respondent to complete the Parent Questionnaire. If the student's mother does not reside in the household, the appropriate respondent is the first person on the following list who lives with the student:

  • stepmother
  • other female guardian, such as a legal guardian or grandmother
  • father
  • stepfather
  • other male guardian, such as a legal guardian or grandfather

Do not schedule an interview with a male respondent out of convenience. If the mother, stepmother, or other female guardian lives with the adolescent but is unavailable at the time of your visit, ask your household contact for the best time to reach her. If the preferred female resident refuses to be interviewed, the adolescent's father, stepfather, or other male guardian may act as respondent.

What is the retention rate for Add Health?

Due to the design of the study, where Wave I seniors were not selected to be interviewed at Wave II, retention rate is not an appropriate statistic to use to describe Add Health study participation. Any calculated retention rate would be misleading. The response rate at each wave is the best indicator to use.

What are the response rates for each wave?

The response rate for Wave I is 79%.

The response rate for Wave II is 88.6%.

The response rate for Wave III is 77.4%.

The response rate for Wave IV is 80.3%.

Which sections of Wave I were self-administered?

Sections 24-33 were administered using Computer-Assisted Self-Interview (CASI) at Wave I.

Which sections of Wave II were self-administered?

Sections 23-32 were administered using Computer-Assisted Self-Interview (CASI) at Wave II.

Which cases were selected to be re-interviewed at Wave II?

16,706 of the Wave I respondents were selected to be re-interviewed at Wave II. In general, two groups were not selected to be interviewed at Wave II: respondents who were seniors at Wave I and were not part of a genetic pair, and respondents in the disabled sample.

Which cases were fielded at Wave III?

  20,745  Wave I in-home respondents
+     45  Wave II only genetic respondents (data for these respondents have not previously been released)
-    687  Wave I cases without a weight and without a genetic sample flag that were not selected for Wave III
= 20,103  Wave III fielded sample

Which cases were interviewed at Wave III?

  15,170  Wave I respondents interviewed
+     27  Wave II only genetic respondents interviewed
= 15,197  Wave III interviewed cases

How many of the interviewed Wave III public-use sample were originally selected for the core and high education black samples?

Wave III public-use sample = 4,882

Core sample only = 4,490

High education black sample only = 325

Both samples = 67

Which sections of Wave III were self-administered?

Sections 16-29 were administered using Computer-Assisted Self-Interview (CASI) at Wave III.

Which cases were fielded at Wave IV?

Out of 20,745 Wave I in-home respondents, 19,962 cases were fielded at Wave IV. The others were determined ineligible.

Which cases were interviewed at Wave IV?

Of the 19,962 fielded cases, 15,701 were interviewed at Wave IV.

Which sections of Wave IV were self-administered?

Sections 15 and 17-24* were administered using Computer-Assisted Self-Interview (CASI) at Wave IV.

*Section 18 was both interviewer- and self-administered.

When there are gender discrepancies between Wave I and Wave IV, how do you know which one is correct?

The Wave IV gender is the correct gender.

 

Questions about Data

General

How does the Public-Use dataset compare to the Restricted-Use dataset?

The Add Health Public-Use data can be downloaded from four different sources, as listed on the Add Health Public-Use Data page. The Public-Use dataset contains all the data from the In-Home Interview, but for a smaller sample of respondents; releasing a smaller sample limits deductive disclosure risk. The Public-Use data do not contain ID numbers of friends, siblings, or romantic partners, so these records cannot be linked. The Public-Use data also do not include the files on Obesity and Neighborhood Environment, genetics, disposition, political context, and alcohol density. These files require a restricted-use contract.

How much space do I need to accommodate all Add Health datasets that are available?

The Add Health data require less than 4 GB of storage space, but you will also need to have space available for software and temp files created by the software, depending on your computing configuration. Additionally, 4 to 8 GB RAM are needed for processing.

Are geocodes included in the restricted-use data?

Due to deductive disclosure concerns, geocodes are not available with the restricted-use data. However, Add Health has established a set of requirements for investigators seeking to add supplemental contextual data to Add Health. A brief introduction to the Ancillary Study proposal process and costs is available here. For more information, check the Ancillary Studies page.

What type of geographic data are available?  

Region is the only ‘real’ geographical representation in the Add Health restricted-use data files. Nothing below region is available due to deductive disclosure risks. We have pseudo codes for state, county, census tract, and block group. 

Can the Add Health data be linked to Census information (neighborhood)?

Add Health does not provide geocodes with the restricted-use data, so you cannot add your own Census data. However, many Census variables have already been linked to the Add Health data. A description of Add Health contextual data and the codebooks are available here. Add Health has also established a set of requirements for investigators seeking to add supplemental contextual data to Add Health. A brief introduction to the Ancillary Study proposal process and costs is available here. For more information, check the Ancillary Studies page.

Can supplemental contextual or biological data be added to Add Health?

Add Health has established a set of requirements for investigators seeking to add supplemental contextual or biological data to Add Health, under the auspices of an Add Health ancillary study. An ancillary study is any study that derives support from independent funds outside the Add Health Program Project, and does one or more of the following:

  • Collects new, original questionnaire data on Add Health respondents.
  • Merges secondary data sources onto Add Health respondent or school records and requires personal identifiers (e.g., geocodes) to perform these linkages.
  • Collects new biospecimens from Add Health respondents.
  • Uses archived biospecimens collected by the Add Health study.

For information on ancillary studies, please review the brief introduction to the Ancillary Study proposal process and costs online here. Before considering an ancillary study, review the existing Add Health datasets. For more information, check the Ancillary Studies page.

How do I read a SAS export file?

The following SAS commands will allow you to read a SAS export file:

libname in xport '/directory path where file is located/SAS export file name';
data wave1;
set in.SAS dataset name;
run;

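For example, if the Add Health dataset name on the CD is ALLWAVE1.EXP, the internal SAS dataset name is ALLWAVE1, and your CD drive is D:, use:

/* Point the XPORT engine at the export file on the CD */
libname in xport 'd:\allwave1.exp';
/* Copy the internal dataset ALLWAVE1 into a WORK dataset */
data wave1;
set in.allwave1;
run;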

How do I read a SAS export file with STATA?

If you are using a version of STATA before version 13, use the following STATA command to read a SAS export file.

fdause datasetname.xpt

If you are using STATA 13 and later, use the following STATA command to read a SAS export file.

import sasxport datasetname.xpt

If the SAS file is named datasetname.exp, rename the file to datasetname.xpt before running the STATA command.

How do I read a SAS export file with SPSS?

The following SPSS command allows you to read a SAS export file.

GET SAS DATA="\folder\datasetname.xpt".

If the SAS file is named datasetname.exp, rename the file to datasetname.xpt before running the SPSS command.

What numbers should be used for the NIH Inclusion Enrollment Report?

The Wave I inclusion enrollment numbers are available in this downloadable file. In addition, the cumulative inclusion enrollment report can be downloaded here.

What are the codes for anti-hypertensive medications?

The codes for the anti-hypertensive medications are:

'040-047-xxx' 'BETA-BLOCKERS'

'040-049-156' 'THIAZIDE DIURETICS'

'040-042-xxx' 'ACE-INHIBITORS'

'040-043-xxx' 'ANTI-ADRENERGICS (peripherally acting)'

'040-044-xxx' 'ANTI-ADRENERGICS (centrally acting)'

'040-048-xxx' 'CALCIUM CHANNEL BLOCKERS'

'040-053-xxx' 'VASODILATORS'

'040-056-xxx' 'AT2 RECEPTOR BLOCKERS'

'040-055-xxx' 'COMBO ANTI-HYPERTENSIVES'

Anti-hypertensive medication codes used in the following article: Nguyen QC, Tabor JW, Entzel PP, Lau Y, Suchindran C, Hussey JM, Halpern CT, Harris KM, Whitsel EA. Discordance in national estimates of hypertension among young adults. Epidemiology 2011;22(4):532-541.
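
As an illustration of how the code list above might be applied, here is a minimal Python sketch. It is not Add Health program code. It assumes that medication codes are stored as hyphenated strings in the form shown above and that "xxx" stands for any value at the third level; both points are assumptions, and the names `ANTIHYPERTENSIVE_CLASSES` and `classify_medication` are hypothetical.

```python
# Hypothetical lookup built from the code list above.
# Assumption: "xxx" is a wildcard for any third-level value.
ANTIHYPERTENSIVE_CLASSES = {
    "040-047-xxx": "BETA-BLOCKERS",
    "040-049-156": "THIAZIDE DIURETICS",
    "040-042-xxx": "ACE-INHIBITORS",
    "040-043-xxx": "ANTI-ADRENERGICS (peripherally acting)",
    "040-044-xxx": "ANTI-ADRENERGICS (centrally acting)",
    "040-048-xxx": "CALCIUM CHANNEL BLOCKERS",
    "040-053-xxx": "VASODILATORS",
    "040-056-xxx": "AT2 RECEPTOR BLOCKERS",
    "040-055-xxx": "COMBO ANTI-HYPERTENSIVES",
}


def classify_medication(code):
    """Return the class label for a medication code, or None if not listed."""
    for pattern, label in ANTIHYPERTENSIVE_CLASSES.items():
        if pattern.endswith("-xxx"):
            # Wildcard entry: compare only the first two levels, e.g. "040-047-".
            if code.startswith(pattern[:-3]):
                return label
        elif code == pattern:
            # Fully specified entry: require an exact match.
            return label
    return None
```

Note that thiazide diuretics are the only class listed with a fully specified code, so other codes beginning with 040-049 are not matched by this sketch.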

Sampling and Design Effects

How do I correct for the design effects of the Add Health sampling process?

The online document "Guidelines for Analyzing Add Health Data" discusses how to correct for design effects and the unequal probability of selection to ensure that your analysis results are nationally representative with unbiased estimates.

What variables from the public-use data should be used to correct for design effects?

The online document "Guidelines for Analyzing Add Health Data" refers to variables from the Add Health Restricted-use Data.

Wave I

How many Wave I in-home respondents have in-school questionnaire data?

15,356 of the Wave I in-home respondents also have in-school data.

What was the response rate for the Wave I school administrator questionnaire?

A total of 132 schools were included in the Add Health Wave I sample. An administrator from each school was asked to complete a questionnaire. The response rate among administrators was 98.5%.

What was the response rate for the Wave I parent questionnaire?

The parent questionnaire response rate was 85.4% for the child-specific data.

What is the best way to compute race in the Add Health Wave I in-home data?

Wave I in-home interview variables* used to construct RACE.

H1GI4 - Are you of Hispanic or Latino origin?
H1GI6A - What is your race? white
H1GI6B - What is your race? black or African American
H1GI6C - What is your race? American Indian or Native American
H1GI6D - What is your race? Asian or Pacific Islander
H1GI6E - What is your race? other

A single race variable (RACE) was constructed from the six variables listed above. If the respondent answered "yes" to section 1, question 4 (Are you of Hispanic or Latino origin?), that respondent was given a race designation of "Hispanic" and eliminated from any race category that was marked in section 1, question 6 (What is your race?).

In question 6, respondents were able to mark more than one answer; however, each respondent was placed in only one race category in the RACE variable. If the respondent marked "black or African American" and any other race, that respondent was designated as black or African American and eliminated from the other marked categories. The process was repeated for the remaining race categories in the following order: Asian, Native American, other, and white.

* The racial groups for the in-school questionnaire variables are listed in a different order.

Example code:

/* Hispanic or Latino, All Races */
if h1gi4=1 then race=1;
/* Black or African American, Non-Hispanic */
else if h1gi6b=1 then race=2;
/* Asian or Pacific Islander, Non-Hispanic */
else if h1gi6d=1 then race=3;
/* American Indian or Native American, Non-Hispanic */
else if h1gi6c=1 then race=4;
/* Other, Non-Hispanic */
else if h1gi6e=1 then race=5;
/* White, Non-Hispanic */
else if h1gi6a=1 then race=6;

The information provided in the program code repository by the Add Health team is a service to the Add Health research community. It is provided "as is" with no guarantees as to suitability for a particular purpose.
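
For readers working outside SAS, the same priority ordering can be sketched in Python. This is a minimal illustration, not Add Health program code. The function name `construct_race` and the use of a plain dictionary keyed by lowercase variable names are assumptions, and the sketch assumes that a value of 1 means the item was marked, as in the SAS example above.

```python
# Hypothetical helper mirroring the RACE construction order described above.
# Assumption: each variable is coded 1 when the respondent marked that answer.
RACE_PRIORITY = [
    ("h1gi4", 1),   # Hispanic or Latino, all races
    ("h1gi6b", 2),  # Black or African American, non-Hispanic
    ("h1gi6d", 3),  # Asian or Pacific Islander, non-Hispanic
    ("h1gi6c", 4),  # American Indian or Native American, non-Hispanic
    ("h1gi6e", 5),  # Other, non-Hispanic
    ("h1gi6a", 6),  # White, non-Hispanic
]


def construct_race(record):
    """Return the first matching race code, or None if nothing was marked."""
    for variable, code in RACE_PRIORITY:
        if record.get(variable) == 1:
            return code
    return None
```

Because the list is checked in order, a respondent who marked both "white" and "Asian or Pacific Islander" receives code 3, and any respondent of Hispanic or Latino origin receives code 1 regardless of the race items.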

What is the best way to compute age in the Add Health Wave I in-home data?

To compute a Wave I age variable with the Wave I data, use the following variables and formula:

IMONTH - Month interview completed
IDAY - Day interview completed
IYEAR - Year interview completed
H1GI1M - What is your birth date? month [and year]
H1GI1Y - What is your birth date? [month and] year

The respondent's age is constructed using the interview completion date and date of birth variables. Because only the month and year of birth are available, 15 is used as the day of birth when calculating age. Consult the Introduction to the Adolescent In-Home Codebook to be sure to take into account the respondents whose birth date and/or interview date is incorrect. Additionally, a few birth dates were corrected during the four waves of data collection so the Wave I date of birth should be compared to the last wave of data for the respondent. The last wave of participation is considered the most correct.

SAS programming code that can be used to construct a Wave I AGE variable using Wave I variables is provided below.

idate=mdy(imonth,iday,iyear);
bdate=mdy(h1gi1m,15,h1gi1y);
age=int((idate-bdate) / 365.25);

The code to construct Wave I age in Stata is below.

recode h1gi1m (96=.), gen(w1bmonth)
recode h1gi1y (96=.), gen(w1byear)
gen w1bdate = mdy(w1bmonth, 15, 1900+w1byear)
format w1bdate %d
gen w1idate = mdy(imonth, iday, 1900+iyear)
format w1idate %d
gen w1age = int((w1idate-w1bdate)/365.25)

This information is provided by the Add Health team as a service to the Add Health research community. It is provided "as is" with no guarantees as to suitability for a particular purpose.
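
The same calculation can be sketched in Python for readers who want to check a single case by hand. This is an illustration only, not Add Health program code. The function name `wave1_age` is hypothetical, and the sketch assumes that years are passed as four-digit values and that missing or incorrect dates have already been handled as described in the codebook.

```python
from datetime import date


def wave1_age(imonth, iday, iyear, birth_month, birth_year):
    """Age in whole years, mirroring int((idate - bdate) / 365.25)."""
    # Assumption: years are four-digit values (for example 1995 and 1979).
    interview_date = date(iyear, imonth, iday)
    # Only month and year of birth are released, so the day is fixed at 15.
    birth_date = date(birth_year, birth_month, 15)
    elapsed_days = (interview_date - birth_date).days
    return int(elapsed_days / 365.25)
```

For example, `wave1_age(6, 20, 1995, 6, 1979)` gives 16, while `wave1_age(6, 10, 1995, 6, 1979)` gives 15 because the interview falls before the assumed birthday of the 15th.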

What does the Wave I variable COMMID represent?

The COMMID variable groups together the respondents who attend the high school and feeder school that make up the 7 - 12 grade span for the strata.

Why are there 1,821 respondents without a Grand Sample Weight at Wave I?

The following Wave I cases could not be weighted:

  • Cases added in the field.
  • Cases selected as a pair (twins, half-sibs) where both were not interviewed.
  • Cases without a sample flag.
  • Respondents from schools outside of the 80 strata.

Wave II

What was the response rate for the Wave II school administrator questionnaire?

A total of 132 schools were included in the Add Health Wave II sample. An administrator from each school was asked to complete a questionnaire. The response rate among administrators was 87.0%.

How do I code gender changes between Wave I and Wave II?

When there is a discrepancy between the Wave I and Wave II gender of a respondent, use the Wave II gender. The restricted-use data include 23 cases in which the Wave I variable BIO_SEX and the Wave II variable BIO_SEX2 do not match. The Wave II data have been confirmed as correct. Wave II includes 7 cases in which the variable SEXFLG2 equals 1. This indicates that the incorrect gender was used to control the questionnaire skips during the interview. The variable BIO_SEX2 was corrected, but answers to questions based on gender will be incorrect.

Wave III

Where can I find the monograph about biomarkers collected in Wave III?

"Biomarkers in Wave III of the Add Health Study" is available in pdf format. The monograph outlines relevant procedures, design, and sampling schemes used in the collection of biomarker data, and serves as a user guide for analysis and interpretation.

How can I obtain a copy of the first release of the Education Data?

The restricted-use Education Data, collected by the Adolescent Health and Academic Achievement Study, are available through the Add Health contract. For users who already have a contract, contact to request an order form for the Education Data. A copy of the public-use version of the file can be downloaded from the ICPSR website.

When there are gender discrepancies between Wave I and Wave III, how do I know which one is correct?

There are 20 cases in which Wave III gender (BIO_SEX3) does not match the Wave I gender (BIO_SEX). At Wave III, the preloaded gender variable came from the last wave of available data. Eighteen of these inconsistent cases match the Wave II gender (BIO_SEX2) and were confirmed at Wave III as being correct. Of the remaining two inconsistent cases:

In one case the Wave III gender, female, was confirmed by the Add Health security manager as being correct.

In the last case, both the Wave I and Wave II gender are listed as male, which is correct. For this case only, the Wave III gender is incorrect.

When I calculate a Wave III respondent age using the birth date (month, 15, year) and date of interview, I do not get the same age for some respondents as the one found in variable CALCAGE3. Why does this happen?

The age calculated during the interview uses the actual day of birth, which is not released with the Add Health data. During the Wave III interview, the age of the respondent was calculated by the computer interviewing program and then verified by the respondent. The discrepancy in the ages occurs when a respondent is interviewed during his or her birth month.

Am I allowed to include quotes in my dissertation from the Wave III open-ended question on how a mentor helped the young person?

You may not include any open-ended responses in your dissertation. A quoted response is a frequency of 1, which is not allowed.

            “In no table should all cases in any row or column be found in a single cell.”

Wave IV

How many of the interviewed Wave IV public-use sample were originally selected for the core and high education black samples?

Wave IV public-use sample = 5,114

Core sample only = 4,699

High education black sample only = 345

Both samples = 70

How do I recode the variables for Wave IV, Section 21: Criminal Offending and Victimization?

If your Wave IV in-home interview file is dated before March 2012, it will need to be updated with the following:

The Wave IV, Section 21: Criminal Offending and Victimization variables H4DS13 – H4DS20 contain implausible values for some respondents that need to be recoded. To correct these implausible values, 1) you may request an updated Wave IV data file that contains the recoded values, or 2) you may use the SAS program, Recode Wave IV, Section 21_SAS.pdf, or the Stata program, Recode Wave IV, Section 21_Stata.pdf, with your original Wave IV data file to recode these variables. The program will make the following transformations to the data:

  1. Temporarily recode values of 6, 8, and 9 for variables H4DS1 – H4DS20 to missing so that these values are not included in the sums constructed in items 2 and 3.
  2. Create a variable, VAR1, that is the sum of variables H4DS1 – H4DS11.
  3. Create another variable, VAR2, that is the sum of variables H4DS13 – H4DS20.
  4. Recode variables H4DS13 – H4DS20 to missing when VAR1 = 0 and VAR2 = 8.

After these transformations, there will be 1,424 observations in the restricted-use data file with a value of missing for variables H4DS13 – H4DS20. The public-use file will have 442 observations with a value of missing.
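
The transformations listed above can be sketched in Python for a single respondent record. This is a hedged illustration, not the distributed SAS or Stata program. The names `recode_section21` and `_sum_valid` are hypothetical, missing values are represented as `None`, and the sketch assumes that a sum is itself missing when every contributing item is missing.

```python
NONRESPONSE_CODES = {6, 8, 9}  # values left out of the sums


def _sum_valid(record, names):
    """Sum valid values; return None if every item is missing (assumption)."""
    values = [record.get(name) for name in names]
    valid = [v for v in values if v is not None and v not in NONRESPONSE_CODES]
    return sum(valid) if valid else None


def recode_section21(record):
    """Return a copy of record with H4DS13 - H4DS20 set to None when flagged."""
    first_block = [f"H4DS{i}" for i in range(1, 12)]    # H4DS1 - H4DS11
    second_block = [f"H4DS{i}" for i in range(13, 21)]  # H4DS13 - H4DS20
    var1 = _sum_valid(record, first_block)
    var2 = _sum_valid(record, second_block)
    result = dict(record)  # original codes, including 6/8/9, are preserved
    if var1 == 0 and var2 == 8:
        for name in second_block:
            result[name] = None
    return result
```

Excluding the values 6, 8, and 9 only while summing has the same effect as the temporary recode: the original codes are left unchanged in every record that is not flagged.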

Wave V

When will the full Wave V survey data be released?

We anticipate disseminating the full Wave V survey data in 2019.  Please join our listserv in order to receive the data release announcements.  

 

Questions about Contracts

General

Where can I find the contract or data user agreement?

Go to CPC Data Portal. Download and complete the Restricted Use Data contract under Add Health DUA requirements.

What are the requirements to be eligible to apply for a contract?

Investigators must meet the following criteria:

  • A. Have a PhD or other terminal degree; and
  • B. Hold a faculty appointment or research position at Institution

Institution must meet the following criteria:

  • A. Be an institution of higher education, a research organization, or a government agency
  • B. Have a demonstrated record of using sensitive data according to commonly accepted standards of research ethics

Who has the authority to sign as Institutional Representative?

  • Must be someone not on the contract in any other role.
  • Must be someone able to enter a legal agreement on behalf of your institution. 
  • Must be someone who works at the Office of Sponsored Research or Contracts office.  

How do I apply for Add Health data?

To apply for restricted-use data or Romantic Pairs data, go to CPC Data Portal.

What is the processing fee for Add Health data?

For new applicants, there is a one-time processing fee of $1,000 for accessing Wave I, II, III, IV, and V data.  There is a separate fee for linking the data to any GWAS data obtained from dbGaP.

How long is the approval process?

The Add Health Roadmap shows you the process and timeline of an Add Health Contract approval.

How can I pay for the Add Health data?

Payment can be made by credit card, check, or money order (a check may be your personal check or one from your institution).

Go to the CPC Data Portal, then download and complete the Investigator Information Page. Add Health will then upload an invoice with instructions.

When will I receive the Add Health data?

After all the requirements are submitted and approved, the data will be uploaded to your contract account within the CPC Data Portal for you to download. The Add Health Roadmap shows you the process and timeline of an Add Health Contract approval.

Is Add Health data regulated by HIPAA?

The UNC IRB has confirmed that, at this time, Add Health Waves I-V data is NOT considered Protected Health Information (PHI) and therefore is not regulated by HIPAA. 

Add Health does NOT disseminate any personally identifiable information (PII) via the restricted use contracts, so restricted use contract holders will not receive PII.

Are the Add Health data de-identified?

The Add Health data are de-identified.  Personal identifying information such as names, addresses, and social security numbers is not connected to the survey responses and biomarker data. Because the data are de-identified, Add Health data are not considered PII.

Will you list my publications and presentations on Add Health’s website?

Absolutely. Any results produced from analysis using the Public-Use data or the Restricted-Use data are eligible for posting. Please provide the complete reference for your publication or presentation. For publications, please include the email address of the first author.

I don’t wish to renew my contract. What do I need to do to terminate it?

To terminate your Contract, download and complete the contract termination form and email to .

Where do I send the CDs when I terminate my contract?

Return CDs to Add Health Contracts by UPS or FedEx with tracking number and signature required to:

Add Health Contracts 
Carolina Population Center
UNC-Chapel Hill
Carolina Square, Suite 210 
123 West Franklin Street 
Chapel Hill, NC 27516

Adding Researchers or Staff to a Contract

How do I add researchers, collaborators, officemates, or information technology staff to my contract?

Add Health now processes all requests through the CPC Data Portal. (If you have Add Health forms on file, please discard them, as the Portal will always have the most current forms.) To add researchers to a contract, go to the CPC Data Portal and follow these steps:

  • Log in (upper right corner) and go to the “Applications” tab
  • Click the “User List” button
  • Fill out all the fields (Last name, First Name, role, email (optional), Access Location)
  • Click the “Add” button
  • Click on contract ID “xxxxxxxxx” to go back to the requirements page
  • Go to the requirement: Add Health Supplemental Agreement

    * If this requirement has been previously approved, you will see "Click here to add more documents"
    * Download the blank form  
    * Upload the completed form(s)

  • Go to the requirement: Add Health Security Pledge

    * If this requirement has been previously approved, you will see "Click here to add more documents"
    * Download the blank form  
    * Upload the completed form(s)

What is the procedure for adding Information Technology Staff who will have access to the data but will not use the data for analysis?

Add Health now processes all requests through the CPC Data Portal. (If you have Add Health forms on file, please discard them, as the Portal will always have the most current forms.) To add an Information Technology staff member to a contract, go to the CPC Data Portal.

Go to the requirement: Add Health Security Pledge

  • If this requirement has been previously approved, you will see "Click here to add more documents"
  • Download the blank form  
  • Upload the completed form(s)

Data Access, Storage, and Security

Can I save my temporary data analysis files after I terminate the contract and return the data CDs?

No. However, you can create a constructed variable file that contains only the variables that you’ve created, with no original data components. This variable file does not have to be deleted every six months. This file should be sent to Add Health upon contract termination, and we will securely store the CD for 3 years, or until a new contract is established.

How can we consolidate the data storage and security administration for two Add Health contracts at the same institution?

  • One researcher (R1) decides to be responsible for the data.
  • The non-responsible researcher (R2) will return his/her copy of the data, submit a final annual report, and terminate the contract.
  • R2 and data users listed in the terminated contract will be added to R1's contract as supplemental researchers.
  • R1 (or a systems admin) trains the new users about data access and security for the new contract. All users from R2's contract must be able to access the one copy of the data from R1's contract.

If there are many users located in different buildings, it’s helpful if the institution’s computing is centralized so that all accounts are on one server.

What are the requirements to request Wave IV Ambient Air Pollutants Data?

The following items are required before this request can be approved. Please submit, as necessary, the documents listed below.

  • Data Analysis Plan (maximum one page; indicate time for completion)
  • Completed Affiliate Form (will be provided upon receipt of this order form). Due to the sensitivity of these data, approved users can only access the data remotely on the Add Health Linux Server.
  • Remote Access Form

Can we receive the Wave IV Ambient Air Pollutants data on a CD, and we will use it in our cold room?

Due to the sensitive nature of the Ambient Air Pollutants data, approved users can only access the data on the Add Health Linux Server for a limited timeframe (6 weeks).

Wave III Romantic Pairs Data Contract

What’s the difference between the standard data contract and the Romantic Pairs contract?

The Romantic Pairs data are available through a separate contract, which is available upon request. The main differences between the standard contract and the Romantic Pairs contract are the renewal schedule, security plan requirements, and access to the data:

  • The Romantic Pairs contract must be renewed every two years, while the standard contract is on a three-year renewal cycle. Renewal of the Romantic Pairs contract requires an annual report, an updated Data Security plan, and Institutional Signatures.
  • There are several security plan options for housing the standard contract data, but the Romantic Pairs data must be housed on a separate, stand-alone computer.
  • Only one person may access the Romantic Pairs data, while the standard contract allows more flexibility. If there are other researchers interested in working with the Romantic Pairs data, they must apply for separate contracts.

Can we put the Romantic Pairs data on the computer that researchers currently use to analyze the Restricted-Use data?

No. Only one user can work with these data; it is not possible to store the Romantic Pairs data on a computer that is accessible to multiple users. The Romantic Pairs data are the most sensitive and restrictive data that we disseminate. Initially, users were required to use these data in the Secure Data Facility (cold room) at the Carolina Population Center, but in order to allow researchers easier access, we developed a modified plan that would approximate the security of the cold room at the researcher's institution. To maintain the security of the files, it is important that the requirements are strictly followed.

I am currently working with the Restricted-Use data and would like to get the 1,507 Romantic Pairs sample in Wave III. Are they included in the data that we already have?

The Wave III Romantic Pairs data are not part of the Restricted-Use data. These data are the most sensitive data that we disseminate and are only available through a separate, single-user contract that requires additional security (stand-alone computer in a locked office, used only by one researcher). Your current PI (or another PhD researcher) would need to enter into a separate contract in order to get the Romantic Pairs data.

It appears we can not have access to both romantic pairs and HIV data sets simultaneously - is there any particular reason for it?

Deductive disclosure becomes an even greater issue when the two data files are linked.  The file becomes a “couple” file, which creates more unique data.  This is the risk that Add Health is protecting against.

The romantic partner and HIV are two different data sets, correct? 

Yes. An HIV contract includes the Add Health respondent data and the HIV results data. The additional interviews with the Wave III partners of the Add Health respondents are not included.

The romantic partner data does not contain information on HIV, is this correct?

With the Wave III Romantic Pairs contract you get the Wave III Add Health respondent interview data, the Wave III partner interview data, and a file that allows you to link the partners.  The HIV results data are not included with this contract; however, the results of the other STIs are available for both the Add Health respondent and the partner with the Romantic Pairs contract.

Going to the CPC Data Portal, when I add the HIV data to the cart - I do not see it there, only the romantic pairs data. Is this data inaccessible?

That is correct.  Because of the deductive disclosure risk and the sensitive nature of the data, information on how to get the HIV data is provided by request only.

If it is not possible to have the romantic pairs and HIV data together, how do we study the relationship between romantic pairs and HIV? 

That type of analysis is not possible with the Add Health data.