Sampling Overview
Even in the best circumstances, drawing a good sample of an entire
country is a challenging exercise. Russia presents some of the most
difficult circumstances: the territory is vast (spanning 11 time zones
and covering more than one-tenth of the land mass of the world), the
population is ethnically heterogeneous, and the residential patterns
are complex. For example, a large fraction of the population--up to 10
or 12 percent--lives in dormitories or communal apartments. Many of the
census statistics that Western samplers take for granted are
inaccessible or non-existent.
Compounding these problems is the fact that survey research has
a weak tradition in Russia. For ideological reasons, social research
was severely restricted until recently. Many powerful local authorities
prohibited surveys of the adult population in their regions out of fear
that surveys would bring problems to the attention of their superiors.
Thus, in spite of Russia's strong tradition in mathematical statistics,
virtually no one has had the opportunity to grapple seriously with the
practical challenges of survey work. Consequently, with the exception
of so-called "micro-censuses" conducted periodically by Goskomstat to
update census results, there have been virtually no probability samples
of large areas of Russia (or the Soviet Union) until very recently.
Since large-scale independent surveys became permissible in 1989, many
organizations have claimed to have drawn representative samples of
Russia. Unfortunately, sample documentation is invariably inadequate
and quotas are often invoked in the attempt to make the demographic
characteristics of the sample correspond to those of the most recent
census, under the mistaken impression that such correspondence attests
to the high quality of the sample. The RLMS represents the first
nationally representative random sample for Russia, albeit a highly
clustered one.
Phase I (Rounds I - IV)
The goal was to develop a sample of households (excluding
institutionalized people) that would meet accepted scientific standards
of a true probability sample to the greatest extent possible, while
taking into account the severe operational constraints of Goskomstat.
With the advice of William Kalsbeek (a sampling expert at UNC-CH) and
later with help from Leslie Kish, the project developed a replicated
three-stage stratified cluster sample of residential addresses excluding
military, penal, and other institutionalized populations. Replication
was designated for Stage 1 of sampling so that the number of primary
sampling units (PSUs) could be kept manageable, with the understanding
that later they would be expanded. The sample size of each replicate
was set at 20 PSUs. The quality of this sample has been statistically analyzed.
From the outset, the sampling team thought an ideal sample would
include more than 20 PSUs. Given the logistical problems Goskomstat had
to overcome, however, 20 PSUs turned out to be the maximum number that
could be done well. Doubling the number of PSUs would have led to the
degradation of other equally important determinants of quality. There
is a high degree of clustering, the effects of which are well known.
Using a relatively small number of PSUs enlarges standard errors,
although it does not bias results. However, the sample is not as
inefficient as it might seem. Stratification was employed in the
selection of PSUs, and this stratification took advantage of
considerable unpublished data from Goskomstat. The 20 PSUs embrace as
much variability as possible, far more than would have been captured in
a simple random sample of regions (raions). Also, the clustering can
provide important research benefits, for it may actually enhance
potential ancillary data collection (e.g., media monitoring) as well as
comparative analyses of different sites.
In Stage 1, the 2,335 official regions (raions) were implicitly
stratified according to 10 quality-of-life regions and percent urban.
Probability-proportional-to-size (PPS) sampling was then used to select PSUs.
Moscow and St. Petersburg were selected with certainty as
self-representing units, as is commonplace in such surveys.
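As a rough illustration of Stage 1, the sketch below shows one common way to combine implicit stratification with systematic PPS selection: the frame is sorted by the stratifying variables, and units are then picked by stepping through cumulative population at a fixed interval from a random start. The raion records, the function name pps_systematic, the seed, and the number of PSUs drawn are all hypothetical; this is not the project's actual selection program.

```python
import random

def pps_systematic(sizes, n, start):
    """Systematic PPS: return indices of n units from an ordered size list.

    `start` must lie in [0, interval), where interval = sum(sizes) / n.
    A unit larger than the interval can be hit more than once, which is
    why very large units are usually set aside as self-representing.
    """
    interval = sum(sizes) / n
    selected, cum, i = [], 0.0, 0
    for k in range(n):
        target = start + k * interval
        # Advance until the cumulative range of unit i contains the target.
        while i < len(sizes) - 1 and cum + sizes[i] <= target:
            cum += sizes[i]
            i += 1
        selected.append(i)
    return selected

# Hypothetical frame: (quality-of-life region, percent urban, population).
frame = [
    (2, 80, 120_000), (1, 35, 40_000), (1, 90, 200_000),
    (3, 10, 25_000), (2, 55, 90_000), (3, 60, 150_000),
]
# Implicit stratification: sort by region, then by percent urban.
frame.sort(key=lambda r: (r[0], r[1]))
sizes = [r[2] for r in frame]

rng = random.Random(1992)  # fixed seed so the draw can be reproduced
n_psus = 3                 # hypothetical; the RLMS replicate used 20
interval = sum(sizes) / n_psus
chosen = pps_systematic(sizes, n_psus, rng.random() * interval)
print([frame[i] for i in chosen])
```

Because the list is sorted before the systematic pass, the selected units are spread across regions and urbanization levels without any explicit strata being formed.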
In Stage 2, voting districts within each PSU were ordered
according to size (and, in cities, according to relation to the city
center), and PPS procedures were used to select 10 districts in each
PSU. This yielded 200 secondary sampling units (SSUs).
In Stage 3, a list of all household addresses in each SSU was
compiled, where "household" was defined as a group of people living
together and sharing income and expenditures. Adjustments were made to
take into account single addresses at which several households lived
(e.g., adult dormitories and communal apartments). Using an appropriate
interval and a randomly selected starting point in the list of
households, 36 households were selected systematically in each of the
200 SSUs, yielding a sample of 7,200 households. This number was
selected with the expectation that, even after four rounds, at least
5,000 households would remain in the study. In fact, the project
achieved a higher response rate than this.
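The systematic draw in Stage 3 can be sketched as follows. This is a minimal illustration that assumes a fractional sampling interval and a hypothetical list of 500 household entries for one SSU; the function name, the list, and the seed are invented for illustration and are not taken from the project's field procedures.

```python
import random

def systematic_sample(items, n, rng):
    """Draw n items at a fixed interval from a random starting point."""
    interval = len(items) / n          # may be fractional
    start = rng.random() * interval    # random start within first interval
    last = len(items) - 1
    return [items[min(int(start + k * interval), last)] for k in range(n)]

# Hypothetical household list for a single SSU.
household_list = [f"household_{i:03d}" for i in range(500)]

rng = random.Random(7)
sample = systematic_sample(household_list, 36, rng)
print(len(sample), sample[:3])

# Overall design arithmetic from the text: 20 PSUs x 10 SSUs x 36 households.
total_households = 20 * 10 * 36
print(total_households)
```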
Of the 7,200 targeted households, 6,334 provided data for Round
I (17,154 individuals, of whom 4,148 are aged 55 and older). This is a
response rate of 88.8 percent. An additional 40 households (or less
than 1 percent of the sample) refused to participate in Round II
interviews, while a number of Round I refusals agreed to be surveyed
for Round II. A variety of approaches have been used to reduce
subsequent loss to follow-up (including honoraria to respondents and
training interviewers to be courteous and respectful).
Rounds II-IV of the RLMS returned to the same addresses.
Following addresses has significant practical advantages. In
particular, since interviewers have no discretion in the selection of
households to be part of the survey, they cannot reduce their workload
by claiming that households are lost to follow-up. Also, independent
checks of interviewers' work are easier to institute.
Preliminary results on sampling: It is instructive to compare
the demographic attributes of Phase I of the RLMS sample of individuals
with those of the 1989 Russian census. In the following table, the age
distribution of the RLMS sample compares favorably to that determined
by that census some three and a half years earlier. The distributions by
gender and education nearly match those found in the census.
Age Distribution
| Age | 1989 Census | RLMS Sample 3.5 Years Later |
|-------|-------|-------|
| 00-09 | 15.6% | 13.4% |
| 10-19 | 14.2 | 14.4 |
| 20-29 | 14.1 | 11.8 |
| 30-39 | 17.0 | 16.8 |
| 40-49 | 10.7 | 12.8 |
| 50-59 | 12.2 | 12.6 |
| 60-69 | 9.8 | 11.8 |
| 70+ | 6.4 | 6.4 |
| Total | 100% | 100% |
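One way to put a number on how closely the two age distributions agree is sketched below. The percentages are copied from the age table; the summary measures (largest absolute gap and the index of dissimilarity, i.e. half the sum of absolute differences) are our choice for illustration and are not measures reported by the project.

```python
# Age distribution in percent, as given in the table.
age_groups = ["00-09", "10-19", "20-29", "30-39",
              "40-49", "50-59", "60-69", "70+"]
census_1989 = [15.6, 14.2, 14.1, 17.0, 10.7, 12.2, 9.8, 6.4]
rlms_sample = [13.4, 14.4, 11.8, 16.8, 12.8, 12.6, 11.8, 6.4]

# Difference (sample minus census) for each age group.
diffs = {g: s - c for g, c, s in zip(age_groups, census_1989, rlms_sample)}

# Group with the largest absolute gap.
worst_group = max(diffs, key=lambda g: abs(diffs[g]))

# Index of dissimilarity: share of the sample that would have to change
# age group for the two distributions to match exactly.
dissimilarity = sum(abs(d) for d in diffs.values()) / 2

print(worst_group, round(diffs[worst_group], 1))
print(round(dissimilarity, 1))
```

On these figures the largest gap is about 2.3 percentage points (ages 20-29), and the index of dissimilarity is about 4.7 percent.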
Finally, consider the distribution of respondents by nationality
(ethnicity), keeping in mind that the RLMS sample was not designed to
represent all ethnic groups. It is statistically unlikely that small
ethnic groups (of which there are more than 100 in Russia) will be
proportionally represented in a national survey that is not designed
specifically to represent them. It is likely, however, that large
national groups will be proportionally represented. That is precisely
what is observed for the distribution of respondents across the six
largest nationality groups.
| Nationality | 1989 Census | RLMS Sample 3.5 Years Later |
|-------|-------|-------|
| Russian | 81.5% | 82.7% |
| Tatar | 3.8 | 3.1 |
| Ukrainian | 3.0 | 2.5 |
| Chuvashi | 1.2 | 0.3 |
| Bashkir | 0.9 | 0.4 |
| Byelorussian | 0.8 | 0.7 |
| Others | 8.8 | 10.3 |
| Total | 100.0% | 100.0% |
From the outset, nationality (ethnicity) was considered a very
salient issue. Indeed, for Round I, a battery of 15 questions about
ethnic identity and language, fashioned after those in the Soviet
Interview Project, was proposed. This battery of questions was
vigorously opposed by Goskomstat officials who considered it too
sensitive for a government survey, especially since the survey
contained a PSU in a territory threatening secession. Rather than risk
the entire survey over this issue, the project settled for a single
question on ethnic identity that allowed respondents to volunteer more
than one nationality if they wished. More detailed questions on
ethnicity were asked in later rounds of the survey.
On the basis of census figures on nationality and language,
the project anticipated some problems with delivering interviews only
in Russian. The project was prepared to develop multiple language
questionnaires, but colleagues in three Russian institutions insisted
that it would be unnecessary. This proved to be right. In the first
place, the two PSUs that fell in areas with high concentrations of
non-Russians were known to have high concentrations of Russian
speakers. Second, the questionnaire consisted primarily of questions of
fact with lists of everyday nouns, not of opinion items where nuances
make a crucial difference. Therefore, Russian-as-a-second-language was
quite sufficient. The project acknowledges that, if the sample were
expanded to represent ethnic enclaves, the questionnaire language would
surely pose a greater problem than it does presently.
A related issue concerns the reaction of Muslim women to the
abortion, pregnancy history, and other sensitive sections of the
survey. It is important to note that, to some extent, Muslim women in
Russia do not react as do women in countries where Islam has been freely practiced.
In point of fact, there was no appreciable difference in the
willingness of these women to continue in the study, and the response
rate for questions is the same. The loss-to-follow-up rate for Round I
to Round II was 5.6 percent. This is based on the total set of families
who were interviewed in each oblast and the number interviewed who were
included in the original Round I sample. The rates for the two oblasts
with large Muslim populations, Kazan and Nal'chik, are provided as an
example. Kazan oblast had a loss-to-follow-up rate of 3.4 percent and
Nal'chik had a loss-to-follow-up rate of 1.9 percent.
Phase II (Rounds V - XIII)
In Phase II of the RLMS, a multi-stage probability sample was employed. Please refer to the March 1997 review
of the Phase II sample. First, a list of 2,029 consolidated raions was
created to serve as primary sampling units (PSUs). These were allocated
into 38 strata based largely on geographical factors and level of
urbanization, but also based on ethnicity where there was salient
variability. As in many national surveys involving face-to-face
interviews, some remote areas were eliminated to contain costs; also,
Chechnya was eliminated due to armed conflict. From among the remaining
1,850 raions (containing 95.6% of the population), three very large
population units were selected with certainty: Moscow city, Moscow
Oblast, and St. Petersburg city. These constituted the self-representing (SR)
strata. The remaining non-self-representing raions (NSR) were allocated
to 35 equal-sized strata. One raion was then selected from each NSR
stratum using the method "probability proportional to size" (PPS). That
is, the probability that a raion in a given NSR stratum was selected
was directly proportional to its measure of population size.
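The selection of a single raion within one NSR stratum can be illustrated with a short sketch. The raion names, populations, and seed below are hypothetical, and the use of Python's random.choices is simply one convenient way to make a size-weighted draw; the source does not describe the software actually used.

```python
import random

def selection_probabilities(stratum):
    """Probability of selection for each raion: its share of stratum size."""
    total = sum(stratum.values())
    return {name: pop / total for name, pop in stratum.items()}

def select_one_pps(stratum, rng):
    """Draw one raion with probability proportional to its population."""
    names = list(stratum)
    weights = [stratum[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical NSR stratum: raion name -> measure of population size.
stratum = {"raion_A": 50_000, "raion_B": 150_000, "raion_C": 300_000}

probs = selection_probabilities(stratum)
rng = random.Random(1994)
picked = select_one_pps(stratum, rng)
print(probs)
print(picked)
```

In this toy stratum, raion_C is six times as likely to be drawn as raion_A because it has six times the population.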
The NSR strata all have approximately equal sizes because they
were purposefully designed that way to improve the efficiency of
estimates. The target population (omitting the deliberate exclusions
described above) numbers over 140 million inhabitants. Ideally, one
would use the population of eligible households, not the population of
individuals. As is often the case, we were obliged to use figures on
the population of individuals as a surrogate because of the
unavailability of household figures in various regions.
Although the target sample size was set at 4,000, the number of
households drawn into the sample was inflated to 4,718 to allow for a
non-response rate of approximately 15%. The number of households drawn
from each of the NSR strata was approximately equal (averaging 108),
since the strata were of approximately equal size and PPS was employed
to draw the PSUs in each one. However, because we expected response
rates to be lower in urban areas than in rural areas, the extent of
over-sampling varies. This accounts for the differences in households
drawn across the NSR PSUs. It also accounts for the fact that 940
households were drawn in the three SR strata--more than the 14.6%
(i.e., 689) that would have been allotted based on strict
proportionality.
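The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the variable names are ours.

```python
target_households = 4000   # target sample size
drawn_households = 4718    # households actually drawn
sr_drawn = 940             # households drawn in the three SR strata
nsr_strata = 35            # number of NSR strata
sr_population_share = 0.146

# Non-response rate that the inflation from 4,000 to 4,718 allows for.
implied_nonresponse = 1 - target_households / drawn_households

# Average number of households drawn per NSR stratum.
nsr_average = (drawn_households - sr_drawn) / nsr_strata

# SR allotment under strict proportionality to population share.
sr_proportional = sr_population_share * drawn_households

print(round(implied_nonresponse, 3))  # about 0.152, i.e. roughly 15%
print(round(nsr_average))             # 108
print(round(sr_proportional))         # 689
```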
Since there was no consolidated list of households or dwellings
in any of the 38 selected PSUs, an intermediate stage of selection was
then introduced, as usual. Professional samplers will recognize that
this is actually the first stage of selection in the three SR strata,
since those units were selected with certainty. That is, technically,
in Moscow, St. Petersburg, and Moscow Oblast, the census enumeration
districts are the PSUs. However, it is cumbersome to keep making this
distinction throughout the description, and we shall follow the normal
practice of using the terms "PSU" and "SSU" loosely. Needless to say,
in the calculation of design effects, where the distinction is
critical, we have maintained the proper distinction. The selection of
second-stage units (SSUs) differed depending on whether the population
was urban (located in cities and "villages of the city type," known as
"PGTs") or rural (located in villages). That is, within each selected
PSU the population was stratified into urban and rural substrata, and
the target sample size was allocated proportionately to the two
substrata. For example, if 40% of the population in a given region was
rural, 40 of the 100 households allotted to the stratum were drawn from
villages.
In rural areas of the selected PSUs, a list of all villages was
compiled to serve as SSUs. The list was ordered by size and (where
salient) by ethnic composition. PPS was employed to select one village
for each ten households allocated to the rural substratum. Again, under
the standard principles of PPS, once the required number of villages
was selected, an equal number of households in the sample (10) was
allocated to each village. Since villages maintain very reliable lists
of households, in each selected village the 10 households were selected
systematically from the household list. In a few cases, villages were
judged to be too small to sustain independent interviews with 10
households; in such cases, 3 or 4 tiny villages were treated as a
single SSU for sampling purposes.
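The proportional split between urban and rural substrata, and the rule of one SSU per ten households, can be sketched as follows. The function name and the handling of remainders are assumptions made for illustration; the text does not say how allocations that are not multiples of ten were treated.

```python
def allocate_substrata(stratum_households, rural_share, per_ssu=10):
    """Split a stratum's households proportionately into rural and urban
    substrata, then count SSUs at `per_ssu` households each."""
    rural = round(stratum_households * rural_share)
    urban = stratum_households - rural
    return {
        "rural_households": rural,
        "urban_households": urban,
        # Assumption: any remainder below a full SSU is ignored here.
        "rural_ssus": rural // per_ssu,
        "urban_ssus": urban // per_ssu,
    }

# The example from the text: 100 households, 40% of the population rural.
allocation = allocate_substrata(100, 0.40)
print(allocation)
```

For the example in the text this gives 40 rural households in 4 villages and 60 urban households in 6 urban SSUs.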
In urban areas, SSUs were defined by the boundaries of 1989
census enumeration districts, if possible. If the necessary information
was not available, 1994 microcensus enumeration districts, voting
districts, or residential postal zones were employed--in decreasing
order of preference. Since census enumeration districts were originally
designed to be roughly equal in population size, one district was
selected systematically without using PPS for each 10 households
required in the sample. In the few cases where postal zones were used,
one zone was likewise selected systematically for each 10 households.
However, where voting districts were used, to compensate for the marked
variation in population size, PPS was employed to select one voting
district for each 10 households required in the urban sub-stratum.
Given the lack of reliable official lists of households within the urban
SSUs, we were obliged to develop the list of households from which ten
households were selected. First, a list of dwellings was made. Where
more than one household was known to exist within a single dwelling
(that is, in the communal apartments and enterprise dormitories that
are relatively commonplace in the Russian Federation), the list was
amended so that each household (or space within the dwelling) was
enumerated in advance of selection. Then, the required number of
households was drawn systematically, starting with a random selection
in the first interval.
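The amendment of the dwelling list can be pictured with a small sketch. The addresses and household counts are hypothetical, and the function name is ours; the point is only that each household in a communal apartment or dormitory gets its own line before the systematic draw, so that every household has the same chance of selection.

```python
def enumerate_households(dwellings):
    """Expand (address, household_count) pairs into one entry per household."""
    listing = []
    for address, household_count in dwellings:
        for slot in range(1, household_count + 1):
            listing.append((address, slot))
    return listing

# Hypothetical dwelling list for one urban SSU.
dwellings = [
    ("Apt 1", 1),
    ("Apt 2 (communal)", 3),
    ("Dormitory room 5", 2),
]

household_listing = enumerate_households(dwellings)
for entry in household_listing:
    print(entry)
```

Without this step, a household sharing a communal apartment with two others would have only a third of the selection chance of a household occupying a dwelling alone.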
In both urban and rural substrata, interviewers were required
to visit each selected dwelling up to three times to secure the
interviews. They were not allowed to make substitutions of any sort.
The interviewers' first task was to identify households at the
designated dwellings. "Household" was defined as a group of people who
live together in a given domicile, and who share common income and
expenditures. Households were also defined to include unmarried
children, eighteen years of age or younger, who were temporarily
residing outside the domicile at the time of the survey. If perchance
the interviewer identified more than one household in the dwelling, he
or she was obliged to select one using a procedure outlined in the
technical report. The interviewer then administered a household
questionnaire to the most knowledgeable and willing member of the
household.
The interviewer then conducted interviews with as many adults
as possible, acquiring data about their individual activities and
health. Data for the children's questionnaires were obtained from
adults in the household. By virtue of the fact that an attempt was made
to obtain individual questionnaires for all members of households, the
sample constitutes a proper probability sample of individuals as well
as of households, without any special weighting. Actually, the fact
that we did not interview unmarried minors living temporarily outside
the domicile slightly diminishes the representativeness of the sample
of individuals in that age group.
As described above, the sample frame was essentially based on
dwellings in urban areas and households in rural areas. In conducting
Rounds VI-XII interviewers in both urban and rural areas attempted to
conduct interviews in the same dwellings (or spaces within communal
apartments and dormitories) that fell into the Round V sample. They
returned to each Round V dwelling even if the household in the dwelling
had refused to participate during previous rounds, and even if they
found out that the household whom they interviewed in previous rounds
had moved to a new dwelling prior to the interview.
Since the change in housing stock was minuscule between late 1994 and
late 1995, this procedure ensured that the results in 1995 were
approximately as representative as they were in 1994. The response rate
was nearly the same: 84% in Round V; 80% in Round VI--both respectable
figures in survey research requiring such substantial face-to-face
interviews about every member of every household. Furthermore, by
returning to every dwelling we actually obtained interviews from some
200 households who had declined to participate in Round V. This should
eventually permit some analysis of the nature of non-response in Round
V--an analysis that would be more sophisticated than merely comparing
the demographic characteristics of households to those in the census.
It is especially important to notice that this procedure did not
appreciably vitiate our ability to conduct panel analyses with Round V
and VI data. First, it goes without saying that the data set renders it
quite easy to identify households and people who participated in both
rounds. Second, as it turned out, only 250 households (6.3%) from Round
V moved from their dwellings and were thus lost to Round VI--a low
level of attrition for a panel survey of this sort. Nevertheless, we
did gather data on their new addresses whenever possible in
anticipation of a supplementary study to follow up on them.
As stated above, the household response rate exceeded 80%. As
in Round V, individual questionnaires were obtained from over 97% of
the individuals listed on the household rosters. The response rates did
indeed vary across PSUs depending on the proportion of households in
rural areas. However, since we anticipated this variation when over-sampling, the
actual proportion of completed household interviews compares well to
the proportion of the population in each stratum. The distribution of
household size in the sample, within both rural and urban localities,
corresponds well to the figures from the 1989 census. Bear in mind that
single-member households are excluded from the comparison because the
census includes many institutionalized people, while our sample
explicitly excludes them. Thus, there is no valid basis for comparing that category.
The multivariate distribution of the sample by sex, age, and
urban-rural location compares quite well with the corresponding
multivariate distribution of the 1989 census. Of course, due to random
sampling error and changes in the distribution since the 1989 census,
we would not expect perfect correspondence. Nevertheless, there is
usually a difference of only one percentage point or less between the
two distributions.
Another way to evaluate the adequacy (or efficiency) of the
sample is to examine design effects. An important factor in determining
the precision of estimates in multi-stage samples is the mean ultimate
cluster (PSU) size. All else being equal, the larger the size, the
worse the precision. In Rounds I-IV of the RLMS, the average cluster
size approached 360--a large number dictated by constraints imposed by
our collaborators. Thus, although the sample size hovered around 6,000
households, precision was less than we would have liked for a sample of
that size. In Rounds I and III of the RLMS, the 95% confidence interval
for household income was about ±13%.
In the Phase II sample, the situation was considerably better. Although
there were only 4,000 households, the mean size of clusters was much
smaller than in Phase I. There were 35 PSUs with about 100 households
each; even this was an improvement over the average of 360 in the
design of the RLMS Rounds I-IV. However, in the three self-representing
areas, the respondents were drawn from 61 PSUs. Recall that Moscow city
and oblast, as well as St. Petersburg city, were not sampled, but were
chosen with certainty. Therefore, the first stage of selection in them
was the selection of census enumeration districts. Thus the mean
cluster size across the whole sample was about 42, i.e., 4,000/(35+61). Given
these much smaller cluster sizes, we had reason to expect that
precision in this survey would be as good as it was in Rounds I-IV
despite the smaller sample size. This, in fact, turned out to be the
case in Rounds V-XIII.
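To see why smaller clusters can offset a smaller sample, the sketch below applies Kish's common approximation for the design effect, deff = 1 + (m - 1) * rho, where m is the mean cluster size and rho is the intraclass correlation. The cluster sizes and sample sizes come from the text; the value of rho is purely hypothetical and is not an RLMS estimate, so the resulting numbers are illustrative only.

```python
def kish_deff(mean_cluster_size, rho):
    """Kish's approximate design effect for a clustered sample."""
    return 1 + (mean_cluster_size - 1) * rho

RHO = 0.02  # HYPOTHETICAL intraclass correlation, chosen for illustration

# Mean cluster sizes implied by the designs described in the text.
phase1_m = 7200 / 20          # 360 households per PSU
phase2_m = 4000 / (35 + 61)   # about 42 households per cluster

phase1_deff = kish_deff(phase1_m, RHO)
phase2_deff = kish_deff(phase2_m, RHO)

# Effective sample size: the simple random sample giving equal precision.
phase1_effective = 6000 / phase1_deff
phase2_effective = 4000 / phase2_deff

print(round(phase2_m, 1))
print(round(phase1_deff, 2), round(phase1_effective))
print(round(phase2_deff, 2), round(phase2_effective))
```

Under this assumed rho, the Phase II design has the larger effective sample size despite having fewer households. The actual comparison depends on the intraclass correlation of each variable analyzed.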