The problem of deductive disclosure of an individual respondent's identity is a major concern of federal agencies and researchers. In essence, deductive disclosure is the discerning of an individual respondent's identity and responses through the use of known characteristics of that individual. This risk is not unique to Add Health: anyone known to have participated in any survey may be identified through a combination of personal characteristics that singles out that person's record. For example, in the Add Health in-school dataset of more than 90,000 cases, a cross-tabulation of five variables can distinguish an individual record.
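To illustrate how a cross-tabulation can single out a record, the sketch below counts how many records have a combination of values that no other record shares. It is a minimal illustration, not an Add Health procedure: the function `unique_record_count`, the toy records, and the variable names in `QUASI_IDENTIFIERS` are all hypothetical and do not correspond to actual Add Health variables.

```python
from collections import Counter

# Hypothetical characteristics an acquaintance might know about a respondent.
QUASI_IDENTIFIERS = ["grade", "sex", "birth_month", "siblings", "school"]

def unique_record_count(records, keys):
    """Return how many records have a value combination on `keys` that occurs exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    return sum(1 for n in combos.values() if n == 1)

# Invented toy data for illustration only.
records = [
    {"grade": 9, "sex": "F", "birth_month": 3, "siblings": 2, "school": "A"},
    {"grade": 9, "sex": "F", "birth_month": 3, "siblings": 1, "school": "A"},
    {"grade": 9, "sex": "M", "birth_month": 7, "siblings": 0, "school": "A"},
    {"grade": 10, "sex": "M", "birth_month": 7, "siblings": 0, "school": "B"},
    {"grade": 10, "sex": "M", "birth_month": 7, "siblings": 0, "school": "B"},
]

# Cross-tabulate on progressively more variables and report the unique cells.
for n in range(1, len(QUASI_IDENTIFIERS) + 1):
    keys = QUASI_IDENTIFIERS[:n]
    print(keys, unique_record_count(records, keys))
```

As more variables enter the cross-tabulation, the number of uniquely identifiable records cannot decrease, which is why even a handful of ordinary characteristics can isolate one case in a large file.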
The Add Health data are more sensitive to deductive disclosure than many other datasets. This is due, in part, to the clustered research design. Add Health surveyed all students in grades 7 through 12 in a pair of schools in each of 80 US communities; more than 120,000 students were enrolled in these schools. Before the administration date, parents were informed by letter, sent both home with students and by mail. Assuming that most students live with two other people, roughly 360,000 people (120,000 students plus two household members each) know of the participation of at least one, if not many, of the adolescents attending the selected schools. Additionally, approximately 5,000 school administrators, staff, and teachers were involved in the in-school data collection efforts.
The in-home selection process further increased the number of people aware of Add Health: about 5,000 participants in the in-home component had not completed an In-School Questionnaire. (Eligibility required only that an adolescent's name appear on the school enrollment roster; participation in the in-school session was not a prerequisite.)
Because so many people know of at least one Add Health participant, researchers who use the Add Health contractual dataset are obligated to protect respondents from deductive disclosure by taking extraordinary precautions against unauthorized use of the data. Precautions include, but are not limited to, the following:
- copying the original dataset only once and storing the original CD-ROM in a locked drawer or file cabinet
- saving the computer programs used to construct analysis data files, but not the data files themselves
- retrieving paper printouts immediately upon output
- shredding printouts no longer in use
- password-protecting Add Health data
- signing pledges of confidentiality
- agreeing to use the data solely for statistical reporting and analysis