Quality Control
The RLMS project aims to provide timely and accurate information on
Russia. Thus, the RLMS has invested many of its resources in enhancing
the quality of the data. The project team is the first to note that
Phase I data do not meet the standards of the data that the group has
collected in other countries; nevertheless, RLMS data are of
exceptionally high quality and are very useful for research on the transition in Russia.
Interviewing
Phase I
One area of concern is interviewer training. It is not unusual in
Russia for this training to consist solely of a short lecture, with no
practice. The training program for Phase I of the RLMS was the first of
its kind to last for more than a few hours. One-week training sessions
were held three times in preparation for Round I. One of the sessions
was solely for supervisors, and most interviewers were trained twice.
During the next year, two more supervisor training sessions were held.
Interviewer training was carried out in at least six different sites
per round. High interviewer turnover during the first two rounds was
reduced considerably as interviewers began to understand the
expectations and close monitoring of the project staff. The project
prepared video tapes for each round of data collection, including
formal instruction in how to interview, examples of well-done
interviews, and specific training for the RLMS.
In spite of these efforts, there were considerable problems
with the RLMS training in Phase I. The Russian Goskomstat interviewers
who participated in Round I training were not those who actually
collected the data in about half of the oblasts. In Rounds II and III,
the project was able to train the actual interviewers, supervisors, and
data entry personnel. For Round IV, Goskomstat officials at the time
did not allow any training.
A second area of concern is monitoring and evaluation. During
the fall of 1992, the project conducted the first review of sample
implementation ever conducted in a Russian setting. The project's
collaborators at the Institute of Sociology of the Russian Academy of
Science (ISRosAN) conducted elaborate checks of the way the sample was
drawn in each oblast, including direct checks of lists and households.
They then discussed the sampling procedures in detail with workers in
each oblast. In doing so, they found that a quota system had been used in
two oblasts. Fortunately, non-random procedures were used to fill quotas
in only four voting districts, and then only to replace non-respondents
rather than to draw the entire sample. Michael Swafford, together with
Michael Kosolapov and his group (ISRosAN), continued to monitor sampling
implementation. Their efforts included extensive training sessions on sampling
for Goskomstat supervisors in March and April 1992, consultation by
telephone with most oblast staff during the implementation phase,
reviews of difficult issues and problems uncovered during the training
program, and subsequent revisions of some sampling plans.
A third area of concern is the interviews themselves. The
field supervisor is responsible for the interviews in his or her
region. As a check of the quality of the work of the interviewers,
ISRosAN staff contacted and re-interviewed a random subset of
households. For Round I, ISRosAN revisited and re-interviewed hundreds
of households in eight oblasts and 24 voting districts. Remarkably,
this review showed that all families interviewed by ISRosAN staff had
previously been interviewed by Goskomstat interviewers.
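The re-interview check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sampling fraction, and the idea of recording which revisited households confirmed the earlier interview are hypothetical and are not taken from the project's actual procedures.

```python
import random


def draw_reinterview_sample(household_ids, fraction, seed=None):
    """Return a sorted simple random sample of households to revisit.

    household_ids: iterable of household identifiers (hypothetical format).
    fraction: share of households to revisit, in (0, 1] (an assumption).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    unique_ids = sorted(set(household_ids))
    if not unique_ids:
        return []
    rng = random.Random(seed)
    # Always revisit at least one household.
    k = max(1, round(len(unique_ids) * fraction))
    return sorted(rng.sample(unique_ids, k))


def unverified_households(reinterviewed, confirmed):
    """Households that were revisited but did not confirm the original interview."""
    return sorted(set(reinterviewed) - set(confirmed))
```

An empty result from `unverified_households` corresponds to the outcome reported for Round I, in which every family revisited had previously been interviewed.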
Phase II
In Phase II, it was the responsibility of local supervisors to gather
the necessary information for sampling in accordance with written
instructions, to arrange for training facilities, to invite people to
be trained, to supervise their work, and to check the completed
questionnaires. All local supervisors consulted by telephone with
representatives in Moscow who could answer their questions in advance.
All interviewers underwent a demanding training regimen,
outlined below. Any trainee whose performance during training revealed
him or her to be unsuited for the job was dismissed before field work
began.
- Lectured on the general principles of face-to-face interviewing. We
provided a 70-minute video tape entitled "Introduction to Interviewing"
to ensure that all interviewers received the same instructions and
examples. Where there was no available VCR, we rented video salons.
- Required interviewers to read through the entire questionnaire in advance, then to fill out the questionnaire themselves.
- Showed interviewers an example of a good interview with
commentary, again using a video tape. The tape included a section on the
diet portion of the questionnaire.
- Introduced them to the written questionnaire specifications, entitled "Interviewer Instructions."
- Played the role of respondent while trainees took turns reading questions as they would in an actual interview.
- Had the interviewers practice interviewing in groups of three.
One assumed the role of interviewer; another, the role of respondent;
the third, the role of observer, watching to see whether the
interviewer was working properly. The trainer and perhaps some other
experienced interviewers circulated among the triads to observe.
- Gave the interviewers written exercises that tested their
ability to react properly to certain difficult situations in
administering the questionnaire.
- Reviewed the administrative procedures pertaining to the survey.
- Gave the trainees practice in persuading respondents to participate by having them role play.
- Required interviewers to complete at least one practice
interview with a household that was not in the sample--preferably not a
household related to them, although they were allowed to practice with
relatives first.
- Examined their work after each of their first three or more interviews, until they demonstrated that they were competent.
Data Entry
Phase I
At first, SPSS-DE was employed to reduce clerical error. Data were
entered twice, and then the records were compared. Two training
programs were held (the first for two weeks, the second for one week)
for Goskomstat data entry personnel and their supervisors. UNC-CH
programmers along with ISRosAN and RCPM programmers took part in this
activity. Problems were found in data entry for Round I. SPSS-DE
software required that files be separated into many subfiles, thus
increasing the risk of error; the Russian computers used in addition to
the project's 386's for data entry could not replicate the picture of
each page of the questionnaire on the screen as designed; and the
initial menu-driven diet entry program was too slow. The project went
to great lengths to address all these concerns. New Russian-language
data entry software, requiring only one record per questionnaire, was
prepared specifically for Russian hardware.
In addition, we found that direct entry of the dietary data
led to extensive delays during the cleaning process and did not allow
for adequate editing. For Rounds III and IV the project used a more
traditional system in which dietary data were hand-edited and coded
before data entry.
The project also discovered problems with identification
numbers. It appears that the same individual identification numbers were
used for different respondents in different rounds of survey work, since
Goskomstat interviewers and supervisors did not use the rosters
from previous rounds. It also took time for Goskomstat staff
to recognize the existence and importance of this problem. This led to
considerable delay due to additional cleaning time for Round II. A
system instituted for Phase II addresses this and many other reasons
for delays.
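One way to detect the reuse of an identification number across rounds is to compare traits that should not change for the same person. The sketch below assumes hypothetical field names (`person_id`, `birth_year`, `sex`) and is not the project's actual cleaning program; it only illustrates the kind of check involved.

```python
from collections import defaultdict


def find_reused_ids(records):
    """Return person IDs whose fixed traits differ between records.

    records: iterable of dicts with the hypothetical keys
    'round', 'person_id', 'birth_year', and 'sex'.
    """
    traits_by_id = defaultdict(set)
    for rec in records:
        # Birth year and sex should be constant for one person across rounds.
        traits_by_id[rec["person_id"]].add((rec["birth_year"], rec["sex"]))
    return sorted(pid for pid, traits in traits_by_id.items() if len(traits) > 1)
```

A flagged ID may reflect either a reused number or a keying error in one round, so each case would still need to be checked against the paper questionnaires and rosters.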
As with data entry, the project encountered difficulties with
data cleaning which were rectified only after Round I was completed and
the senior Russian collaborators from Goskomstat and RCPM visited
UNC-CH. The initial suggestion was to use statistical cleaning
techniques, as is traditional in Russia. Subsequently, an agreement was
reached to adopt a more thorough approach that includes checking
problematic codes against original questionnaires, checking all
identification numbers, using checks of subsamples as a guide toward
more detailed data checks, and checking the data collection and entry
work.
Phase II
In Phase II, when questionnaires were returned to local supervisors,
those supervisors were required to examine them to locate problems that
could best be remedied in the field, e.g., by returning to get key
demographic information or cleaning ID numbers so that the roster of
individuals located in the household questionnaire matched those on the
individual questionnaires from that household. The questionnaires were
then transported to Moscow, where yet another ID check was performed.
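The roster check described above amounts to comparing two sets of ID numbers for each household. The following is a minimal sketch, assuming IDs are available as simple lists; the function name and the keys of the returned dictionary are hypothetical.

```python
def roster_mismatches(household_roster, individual_ids):
    """Compare roster IDs with the IDs on individual questionnaires.

    household_roster: person IDs listed in the household questionnaire.
    individual_ids: person IDs found on that household's individual questionnaires.
    """
    roster = set(household_roster)
    individuals = set(individual_ids)
    return {
        # On the roster, but no individual questionnaire carries this ID.
        "missing_individual_questionnaire": sorted(roster - individuals),
        # An individual questionnaire carries an ID the roster does not list.
        "not_on_roster": sorted(individuals - roster),
    }
```

Not every mismatch is an error: a roster member may simply not have completed an individual questionnaire. An ID that appears only on an individual questionnaire, however, usually points to a problem worth resolving in the field.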
In Moscow, coders looked through all questionnaires to code
so-called "other: specify" responses. However, open-ended questions
(e.g., occupation questions) were not coded at this time. Instead,
their texts were fully entered as long string variables.
Entering the open-ended answers as character variables offers several
advantages. First, it allows data
entry to begin immediately, with no delay for coding. Second, it
permits the use of computer programs to assist in coding the string
variables. Third, this method allows any user of the original data sets
to recode the character variables to suit his or her purposes without
going back to the paper copies of the questionnaires.
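As an illustration of the second and third advantages, a user could assign codes to a string variable with a simple keyword lookup. The keywords and numeric codes below are invented for the example and do not correspond to any RLMS coding scheme; real occupation coding would need a far richer dictionary and manual review of unmatched answers.

```python
def code_open_ended(text, keyword_codes, uncoded=None):
    """Return the code of the first keyword found in a free-text answer.

    keyword_codes: dict mapping lowercase keywords to codes (hypothetical scheme).
    uncoded: value returned when no keyword matches, so the answer can be
    set aside for manual coding.
    """
    normalized = text.casefold()
    for keyword, code in keyword_codes.items():
        if keyword in normalized:
            return code
    return uncoded


# Hypothetical coding scheme, for illustration only.
OCCUPATION_CODES = {"teacher": 1, "driver": 2, "engineer": 3}
```

For example, `code_open_ended("School teacher, grades 5-7", OCCUPATION_CODES)` returns the code for "teacher", while an answer with no matching keyword returns `None` and can be routed to a human coder.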
All data entry was handled in-house using the SPSS data entry
program on PCs. For the first survey of Phase II, Round V, the first
pass of data entry began on December 20, 1994, and finished on February
1, 1995. The second (verification) pass overlapped with the first to
speed up the process. It began on January 15, 1995, and was completed
on February 8, 1995 (with the exception of the diet data). The second
pass revealed an error rate of 1% in each pass. Rounds VI-XII used a
similar timeframe.
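The two-pass verification can be pictured as a field-by-field comparison of the two keyed files. The sketch below assumes each pass is held as a dictionary keyed by questionnaire ID; this data layout and the function name are assumptions made for illustration, not a description of how the SPSS data entry program performs verification.

```python
def compare_passes(first_pass, second_pass):
    """Compare two data entry passes field by field.

    Each pass maps a questionnaire ID to a dict of field name -> keyed value.
    Returns (discrepancies, fields_compared), where each discrepancy is a
    tuple (questionnaire_id, field, first_value, second_value).
    """
    discrepancies = []
    fields_compared = 0
    for qid in sorted(set(first_pass) | set(second_pass)):
        rec1 = first_pass.get(qid, {})
        rec2 = second_pass.get(qid, {})
        for field in sorted(set(rec1) | set(rec2)):
            fields_compared += 1
            v1, v2 = rec1.get(field), rec2.get(field)
            if v1 != v2:
                # A value missing from one pass also counts as a discrepancy.
                discrepancies.append((qid, field, v1, v2))
    return discrepancies, fields_compared
```

A discrepancy shows only that the passes disagree, not which one is wrong; deciding that requires going back to the paper questionnaire. Dividing the number of discrepancies by `fields_compared` gives a rough disagreement rate between the passes.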