Common Errors and How to Avoid Them
The following errors are common when analyzing sample survey data in Stata. Each of them can be avoided.
Ignoring Clustering
A common error is to ignore clustering and the unequal probability of selection of participants in your analyses. Ignoring the weights biases the estimates, and ignoring the clustering typically understates the standard errors, which leads to false-positive hypothesis test results. Avoid this error by using the svy commands for your analysis. If your analysis technique is not available with the svy commands, use a command that accepts pweight together with the robust cluster() option (written vce(cluster varname) in current versions of Stata).
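As a minimal sketch of both routes, assume Stata's example dataset nhanes2d (loaded with webuse) and its design variables strata, psu, and finalwgt. These names are assumptions for illustration and should be replaced with those of your own survey. The regress command merely stands in for a command that lacks svy support, and psuid is a hypothetical helper variable.

```stata
* Load an example survey dataset (an assumption for illustration)
webuse nhanes2d, clear

* Declare the design once: PSU, sampling weight, and strata
svyset psu [pweight = finalwgt], strata(strata)

* Preferred: the svy prefix accounts for weights, clustering, and strata
svy: regress height weight age

* Fallback for a command without svy support.
* PSU codes repeat across strata in this dataset, so build a unique
* cluster identifier before clustering on it.
egen long psuid = group(strata psu)
regress height weight age [pweight = finalwgt], vce(cluster psuid)
```

The fallback gives the same point estimates but does not use the stratification, so its standard errors can differ somewhat from the svy results.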
Using the Wrong Weight Type
A second error is using the wrong weight specification in Stata. For data from a sample survey, specify the sampling weight as a pweight (probability weight). Note that pweight is a weight type, not a command. Using any of the other weight types (aweight, fweight, or iweight) can result in incorrect variances, standard errors, confidence intervals, and p-values.
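The effect of the weight type can be seen by running the same model twice. This is only a sketch, assuming the example dataset nhanes2d and its weight variable finalwgt. Neither run accounts for clustering or strata; the comparison isolates the weight type.

```stata
* Example dataset and variable names are assumptions for illustration
webuse nhanes2d, clear

* Wrong for survey data: aweight yields model-based standard errors
regress height weight [aweight = finalwgt]

* Right weight type: pweight yields robust (sandwich) standard errors
regress height weight [pweight = finalwgt]
```

The coefficients agree in the two runs. Only the standard errors, and the confidence intervals and p-values derived from them, change.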
Subsetting the Sample
Another error is subsetting the sample when using the svy commands in Stata. By default these commands use a Taylor series approximation (linearization) for variance estimation and must be able to correctly count the number of primary sampling units (PSUs) that were originally sampled. Subsetting the data may cause an incorrect number of PSUs to be used in the variance computation formula. Do not subset the data from a sample survey. Always use the subpop option with the svy commands for subpopulation analysis.
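A short sketch of subpopulation analysis, assuming the example dataset nhanes2d, in which female is a 0/1 indicator. The dataset, the variables, and the age cutoff are illustrative assumptions only.

```stata
* Example dataset and variable names are assumptions for illustration
webuse nhanes2d, clear
svyset psu [pweight = finalwgt], strata(strata)

* Recommended: every sampled PSU stays in the variance computation
svy, subpop(female): mean height

* subpop() also accepts an if expression for ad hoc subpopulations
svy, subpop(if female == 1 & age >= 40): mean height

* Not recommended: dropping observations first can remove whole PSUs
* keep if female == 1
* svy: mean height
```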
Stratum with only one PSU detected
You may get an error message when you try to run an svy command: "stratum with only one PSU detected". This can happen when observations have missing values for variables in your model and are therefore dropped. If every observation in a PSU is dropped, the entire PSU disappears, and its stratum may be left with only one PSU. Use the svydes command (svydescribe in current versions of Stata) to identify the problem strata. A common fix is to combine a small stratum with an adjoining stratum. See the manual entry on svydes, or in Stata type findit svydes, or see http://www.stata.com/support/faqs/stat/stratum.html for details.
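The diagnosis and the fix might look like the sketch below. It assumes the example dataset nhanes2d. The model variables, the choice of strata 1 and 2, and the helper variables psuid and newstrata are all hypothetical. Which strata to combine depends on your survey's design documentation.

```stata
* Example dataset and variable names are assumptions for illustration
webuse nhanes2d, clear
svyset psu [pweight = finalwgt], strata(strata)

* Describe strata and PSUs, flagging observations with missing
* values on the model variables
svydescribe highbp height weight age

* List only the strata that are left with a single PSU
svydescribe highbp height weight age, single

* Hypothetical fix: merge stratum 1 into adjoining stratum 2.
* PSU codes repeat across strata here, so give each PSU a unique id
* first; otherwise distinct PSUs would be merged as well.
egen long psuid = group(strata psu)
generate newstrata = strata
replace newstrata = 2 if strata == 1
svyset psuid [pweight = finalwgt], strata(newstrata)
```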
What set of observations is Stata analyzing?
Using a subpop variable does not do the same thing as an if qualifier. In fact, that is why the subpop option was invented. The svy commands use the whole dataset to help determine the standard error even if you are only looking at a subset of it (identified with a subpop variable). While it analyzes your data, Stata restricts the computation to observations where ALL of the following variables are non-missing:
- strata (if using one)
- psu (if using one)
- sample weight (if using one)
- subpop (if using one)
- analysis variable(s)
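One way to check which observations were used is to count the complete cases yourself and compare with the results that the svy command stores. This is a sketch under the assumption of the example dataset nhanes2d. The model and variable names are illustrative.

```stata
* Example dataset and variable names are assumptions for illustration
webuse nhanes2d, clear
svyset psu [pweight = finalwgt], strata(strata)

* Observations complete on every design, subpop, and analysis variable
count if !missing(strata, psu, finalwgt, female, highbp, height, weight, age)

svy, subpop(female): logistic highbp height weight age

* Stored results describe what Stata actually used
display "observations:               " e(N)
display "subpopulation observations: " e(N_sub)
display "strata:                     " e(N_strata)
display "PSUs:                       " e(N_psu)
```

Comparing e(N_strata) and e(N_psu) with the design counts reported by svydescribe shows whether missing values removed any strata or PSUs.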


