Russia Longitudinal Monitoring Survey


Evaluation of Samples

Phase I

A sample of 7,200 households was drawn. Our hope was to retain at least 5,000 households over the life of the panel survey. As Table 1 shows, this goal was met. The response rate started at 88.8%, remained above 80% for three rounds, and dropped to 76% only in the fourth round. These rates are quite respectable by international standards, and outstanding by Western European standards.

Within each household, data were solicited for all individuals residing therein, including unmarried minors attending school elsewhere. (Adults responded for children.) As the bottom row of Table 1 indicates, we obtained individual data from 89% to 97% of individuals in participating households during each round. Since we sought information about all household members rather than sampling only some of them, the figures represent the population of adult individuals (not just of households) without any special weighting except for non-responses.

Table 1: Household and Individual Response Rates

                                          Round I      Round II     Round III    Round IV
Dates of field work                       7/92-10/92   12/92-3/93   7/93-10/93   10/93-1/94
Households out of 7,200                   6,334        6,068        5,886        5,473
Household response rate                   88.8%        84.3%        81.8%        76.0%
Individuals in participating households   17,154       16,928       16,273       15,943
Individuals for whom individual
  questionnaires were obtained            16,641       15,121       15,720       14,282
Individual response rate within
  participating households                97.0%        89.3%        96.6%        89.6%
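
The individual response rates in the bottom row of Table 1 can be recomputed from the two rows of counts above it. The following is a minimal sketch in Python; the variable and function names are illustrative and do not come from the RLMS documentation.

```python
# Counts copied from Table 1, Rounds I-IV.
individuals_listed = [17154, 16928, 16273, 15943]
questionnaires_obtained = [16641, 15121, 15720, 14282]


def response_rate(obtained, eligible):
    """Return obtained/eligible as a percentage rounded to one decimal place."""
    return round(100 * obtained / eligible, 1)


# Individual response rate within participating households, one value per round.
individual_rates = [
    response_rate(obtained, listed)
    for obtained, listed in zip(questionnaires_obtained, individuals_listed)
]
print(individual_rates)  # [97.0, 89.3, 96.6, 89.6]
```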

The sample can also be evaluated informally by comparing demographic statistics with corresponding parameters from the census. Bear in mind that a certain amount of demographic change has occurred since the census, so the census itself does not constitute a perfect standard.

The correspondence between household size in the census and in the Phase I sample is very good. For example, in urban areas, the census reports that two-person households constituted 33.1% of all multiperson households; in our sample, the percentage ranges from 32.7% to 34.8% over the four waves. In the census, 3.4% of all urban multiperson households consisted of six or more members; in our sample, from 2.9% to 3.7% do. Only occasionally are deviations as high as two percentage points observed. The deviations are somewhat higher in the rural areas. However, such deviations may be attributed to the fact that the definition of a household in the Soviet census differs from our definition, and the care taken in distinguishing multiple households living in a single-family residence was not the same in the census as in this study.

Next, consider some comparisons based on individual rather than household data. For example, in the 1989 census, males from 0 to 14 years of age living in urban areas constituted 8.37% of the total population; in Rounds I-IV of the RLMS, the corresponding percentages were 7.96%, 8.56%, 8.16%, and 8.71%, respectively--without using any corrective post-stratification weights. There are many similarly heartening comparisons. However, one can find some sharper deviations. For example, the total percentage of rural citizens was 23.2% in Round I rather than 26.57%, as in the 1989 census.

Also, consider the distribution of respondents by nationality (ethnicity). Please remember that the Phase I sample was not designed to represent all ethnic groups any more than the Current Population Survey or the General Social Survey in the U.S. is designed to represent, say, Vietnamese, Eskimos, or the Chinese in San Francisco. In the 1989 census, 81.5% of the population of the Russian Federation claimed to be ethnic Russian; in Round I of this sample, 82.7% did. In the census, 3.8% were Tatar; in this sample, 3.1%. In the census, 3.0% were Ukrainian; in this sample, 2.5%.

Similarly, consider the distribution of respondents' education. In the 1989 census, 6.5% of the Russian population aged 15 and older had completed three or fewer years of schooling; in our sample, 5.0% had done so. In the census, 27.4% had completed general secondary school; in our sample, 24.6% had done so. In the census, 11.3% claimed to have completed higher education; in our sample, 14% made this claim. This upward educational bias is far less than is typically observed in non-probability based surveys of the Russian Federation.

While the above figures are generally encouraging, they concern only demographic variables. We turn now to the calculation of the design effect for one of the most important variables: total household income. As an illustration, data from Round III are used. Since inflation occurred during field work, ruble amounts are deflated to June 1992 levels: the mean total household income was 7,950 rubles; the standard deviation was 12,585 rubles. (Incidentally, inflated to December 1994 levels, these amounts would be 628,050 rubles and 994,215 rubles, respectively.)

Most statistical packages (and consequently most analysts) disregard sample design effects. They are not calculated easily. Furthermore, they can vary for every variable in a questionnaire as well as for all composite variables. Nevertheless, given the small number of PSUs in this survey, it seemed necessary to perform the calculations to provide some assurance of the level of precision achieved. The results appear below in Table 2.

Table 2: Design Effects for Total Household Income

Number of PSUs         Deft (square root    Standard error in    Size of 95%
                       of design effect)    June 1992 rubles     confidence interval
20                     3.16                 534                  ±13.2%
40                     2.34                 395                  ±9.7%
60                     1.72                 291                  ±7.2%
Simple random sample   1.00                 169                  ±4.2%

Had this been a simple random sample of 5,546 households from the entire population of households in the Russian Federation, the design effect would have been precisely 1.00 by definition (see the bottom row). Using the standard formulas, the standard error would have been computed to be 169 rubles; the 95% confidence interval expressed in terms of the mean household income would have been ±4.2% (i.e., (1.96 * 169 rubles) / 7,950).
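
The simple-random-sample figures quoted above follow from the usual formula for the standard error of a mean, the standard deviation divided by the square root of the sample size. A short Python sketch of that arithmetic, using only the numbers given in the text (the variable names are illustrative):

```python
import math

# Round III figures from the text, in rubles deflated to June 1992 levels.
mean_income = 7950
sd_income = 12585
n_households = 5546

# Standard error of the mean under simple random sampling.
se_srs = sd_income / math.sqrt(n_households)

# Half-width of the 95% confidence interval as a percentage of the mean.
ci_half_width_pct = 100 * 1.96 * se_srs / mean_income

print(round(se_srs))                # 169
print(round(ci_half_width_pct, 1))  # 4.2
```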

All national samples involve stratification and clustering to cut costs. The convenience and savings exact a toll: the confidence interval around the results (or the standard deviation of the results) becomes larger, i.e., precision is decreased. This is measured with the design effect, or with the square root of the design effect. In this survey, the design effect (DEFF) for total household income was about 9.975, based on data from Rounds I and III. Its square root (DEFT) is 3.16 (see the top row). In other words, the standard error (534 rubles) is 3.16 times as large as it would have been had we obtained these results from a simple random sample. Consequently, the precision is worse: ±13.2%. As the table reveals, had we employed 40 rather than 20 PSUs, we would have achieved an estimated precision level of ±9.7%; had we employed 60 rather than 20 PSUs keeping the same sample size, we would have achieved a precision level of ±7.2%. This constitutes a more reasonable point of comparison than a simple random sample, since no simple random sample of large countries is feasible.
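
The relationship described above, in which the standard error under the actual design equals DEFT times the simple-random-sample standard error, can be checked against the reported figures. The sketch below is illustrative only: it takes the DEFT values as given in the report and does not attempt to derive them, and the function name is hypothetical.

```python
import math

mean_income = 7950  # rubles, June 1992 levels (from the text)
se_srs = 169        # simple-random-sample standard error (from the text)

# DEFT values as reported; how the 40- and 60-PSU values were
# estimated is not described in the text.
deft_by_design = {"20 PSUs": 3.16, "40 PSUs": 2.34, "60 PSUs": 1.72, "SRS": 1.00}


def precision(deft):
    """Return (standard error in rubles, 95% CI half-width as % of the mean)."""
    se = round(deft * se_srs)
    half_width_pct = round(100 * 1.96 * se / mean_income, 1)
    return se, half_width_pct


for design, deft in deft_by_design.items():
    print(design, precision(deft))

# DEFT is the square root of DEFF.
print(round(math.sqrt(9.975), 2))  # 3.16
```

Run as written, this reproduces the standard errors (534, 395, 291, and 169 rubles) and the interval sizes (13.2%, 9.7%, 7.2%, and 4.2%) shown in Table 2.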

In conclusion, please remember that sampling error is only one of several kinds of error that can taint the results of surveys. One must also be concerned about the quality of the questionnaire, of interviewer training and fieldwork, of data entry and cleaning--all of which were conducted in new ways in the RLMS Phase I project conducted with Goskomstat. Though we in fact pushed for a larger number of PSUs, in hindsight we see that we were perhaps unwittingly fortunate that Goskomstat's resources limited the number of PSUs to 20. This allowed us to concentrate sufficiently on the somewhat unmeasurable non-sampling aspects of quality while giving up a tolerable and quantifiable amount of sampling precision.

Phase II

For a detailed review of the Phase II sample, please read the Sample Attrition, Replenishment, and Weighting in Rounds V-VII report. The household response rate exceeded 80%. In both Rounds V and VI, individual questionnaires were obtained from over 97% of the people listed on the household rosters. The response rates did indeed vary across PSUs depending on the proportion of households in rural areas. However, since we anticipated this by over-sampling, the actual proportion of completed household interviews compares well to the proportion of the population in each stratum. Most entries differ by less than .004; St. Petersburg constituted the worst exception (.0294 rather than .0355).

The distribution of household size in the sample, within both rural and urban localities, corresponds well to the figures from the 1989 census. Bear in mind that single-member households are excluded from the comparison because the census includes many institutionalized people, while our sample explicitly excludes them. Thus, there is no valid basis for comparison.

The multivariate distribution of the sample by sex, age, and urban-rural location compares quite well with the corresponding multivariate distribution of the 1989 census. Of course, due to random sampling error and changes in the distribution since the 1989 census, we would not expect perfect correspondence. Nevertheless, there is usually a difference of only one percentage point or less between the two distributions.

Another way to evaluate the adequacy (or efficiency) of the sample is to examine design effects. One of the important factors in determining the precision of estimates in multistage samples is the mean ultimate cluster (PSU) size. All else being equal, the larger the size, the worse the precision. In Phase I of the RLMS, the average cluster size approached 360--a large number dictated by constraints imposed by our collaborators. Thus, although the sample size hovered around 6,000 households, precision was less than we would have liked for a sample of that size.
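
One common way to see why larger clusters hurt precision is Kish's approximation, DEFF = 1 + (b - 1) * rho, where b is the mean cluster size and rho is the intraclass correlation. The report does not say that this formula was used, and the value of rho below is purely hypothetical; the Python sketch only illustrates the direction of the effect.

```python
import math


def kish_deff(mean_cluster_size, rho):
    """Approximate design effect for equal-sized clusters (Kish's formula)."""
    return 1 + (mean_cluster_size - 1) * rho


# Hypothetical intraclass correlation, chosen only for illustration.
rho = 0.03

# 360 is the Phase I mean cluster size from the text; 50 and 100 are
# illustrative comparison values.
for cluster_size in (50, 100, 360):
    deff = kish_deff(cluster_size, rho)
    print(cluster_size, round(deff, 2), round(math.sqrt(deff), 2))
```

With rho held fixed, the design effect grows linearly with the mean cluster size, so a design with the same total sample spread over more, smaller clusters yields smaller standard errors.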

In the Phase II sample, the situation was considerably better. Although there were only 4,000 households, the mean size of clusters was much smaller than in the Phase I sample. There were 35 PSUs with about 100 households each; even this was an improvement over the average of 360 in the design of the RLMS Phase I sample. However, in the three self-representing areas, the respondents were drawn from 61 PSUs. Thus, the mean cluster size in the entire sample was about 42, i.e., 4,000/(35+61). Given these much smaller cluster sizes, we had reason to expect that precision in this survey would be as good as it was in Phase I, despite the smaller sample size. This, in fact, turned out to be the case in Round V, the first round of Phase II. The mean total household income was 510,146 rubles, with a 95% confidence interval of ±65,950 rubles (i.e., ±12.9%).
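
The two Phase II figures quoted above can be recomputed directly. This Python sketch assumes that 65,950 rubles is the half-width of the 95% confidence interval, which is how the Phase I intervals are expressed; the variable names are illustrative.

```python
# Phase II design figures from the text.
households = 4000
psus_other_strata = 35       # PSUs with about 100 households each
psus_self_representing = 61  # PSUs in the three self-representing areas

mean_cluster_size = households / (psus_other_strata + psus_self_representing)

# Round V total household income, in rubles.
mean_income = 510146
ci_half_width = 65950  # assumed to be the 95% CI half-width

relative_half_width_pct = 100 * ci_half_width / mean_income

print(round(mean_cluster_size))           # 42
print(round(relative_half_width_pct, 1))  # 12.9
```

The resulting 12.9% is close to the 13.2% obtained for the Phase I sample, which is the basis for the statement that precision was about as good despite the smaller sample.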