
National Institute of Mental Health (NIMH)
Collaborative Psychiatric Epidemiology Survey Program (CPES) Data Set.
Integrated Weights and Sampling Error Codes for Design-based Analysis
Steven G. Heeringa, Patricia Berglund
Statistical Design Group, Survey Research Center, University of Michigan
June 4, 2007
Under contract to the National Institute of Mental Health (NIMH), the Survey Research Center (SRC) has developed an integrated database for the Collaborative Psychiatric Epidemiology Survey (CPES) program: the National Comorbidity Survey-Replication (NCS-R), the National Survey of American Life (NSAL) and the National Latino and Asian American Study (NLAAS). Heeringa et al. (2004) describe the sample designs and sample outcomes for the three CPES surveys. A general description of the survey methodology for the CPES surveys can be found in Pennell et al. (2004).
This technical report outlines the method for integrating the design-based analysis weights and variance estimation codes for these three studies to permit analysts to approach analysis of the combined dataset as though it were a single, nationally-representative study.
The method of integrating the analysis of these three major survey programs was based on an adaptation of a multiple frame approach to estimation and inference for population characteristics (Hartley, 1962, 1974). There are several features and advantages to the method that are worth noting:
It was built on all of the study-specific weight development efforts conducted to date (Kessler et al., 2004; Heeringa et al., 2004; Heeringa et al., 2006).
It integrated overlapping representation of domains of the CPES survey population in a way that was mathematically transparent and easily understood by analysts of the combined data set. Given the large investments in study-specific weight development, this approach minimized the chance for conceptual or computational errors.
It was centered on the assumption that, conditional on the sample domain (e.g., block groups with 10-29.9% African American population) and the race/ethnicity of the respondent (e.g., Mexican-American), each study's sample representation based on the revised weight is proportional to the number of cases it "contributes" to the geographic domain x race/ethnicity cell.
The CPES survey population was defined by the union of the survey populations for the three component studies. This included adults age 18 and older, living in households in the 48 coterminous United States (NCS-R, NSAL). The survey population for the Latino and Asian ancestry groups extended to the State of Hawaii as well.
CPES analysts are free to define respondent groupings for analysis; however, for purposes of weight development twelve specific race/ancestry groupings were initially specified. These groupings are listed in Table 1. Due to the small number of persons of Other ancestry interviewed in the NCS-R, those individuals were combined with the White race category for purposes of the CPES weight computation.
| Table 1: Race/Ancestry Groupings Required For CPES Weight Development | |
|---|---|
| CPES Race/Ancestry Population Group | Survey Populations |
| Vietnamese | NCS-R, NLAAS |
| Filipino | NCS-R, NLAAS |
| Chinese | NCS-R, NLAAS |
| All Other Asian * | NCS-R, NLAAS |
| Cuban | NCS-R, NLAAS, NSAL |
| Puerto Rican | NCS-R, NLAAS, NSAL |
| Mexican | NCS-R, NLAAS |
| All Other Hispanic * | NCS-R, NLAAS, NSAL |
| Afro-Caribbean (non-Hispanic) | NCS-R, NSAL |
| African-American (non-Hispanic) | NCS-R, NSAL |
| White | NCS-R, NSAL |
| All Other (Pacific Islander, Native American, etc.) | NCS-R |
| * Based on NLAAS screening criteria | |
The breakdown of the full population into these 12 race/ancestry populations was a direct result of the specific eligibility and oversampling provisions of the NSAL and the NLAAS study designs. As shown in Table 1, NCS-R provided nearly universal coverage of all 12 race/ancestry groups. NSAL and NLAAS provided in-depth coverage of specific populations and with the exception of Afro-Caribbeans from Spanish language countries in the Caribbean (e.g., Cuba, Dominican Republic), the oversampling in each of these two studies did not overlap.
These 12 population groupings form the first dimension of a two-dimensional array that was used to apportion/adjust study-specific weights to create a new weight variable for integrated CPES analyses. These "population" groupings were defined at the respondent level. If individual respondents had multiple race/ancestry, they were assigned to a single category according to the priority order in the NLAAS and NSAL respondent classification rules (e.g. Afro-Caribbean taking preference over African-American, Vietnamese over Chinese). If ancestry for NCS-R cases could not be explicitly established at the level of detail required to map them into the NSAL or NLAAS population categories, they were stochastically assigned to a category based on the prevalence of each population in the Census Block Group in which the respondent's household was located.
The second dimension of the CPES weight computation array was defined based on the geographic domain of the U.S. national sample frame with which individual area segments for the three component samples were associated (see Heeringa et al., 2004). The "domain" groupings were assigned at the area segment level. All respondents from the same segment, regardless of population, were assigned to the same domain. Table 2 defines the 11 domain categories that were used to classify area segments and thereby assign each CPES respondent to a geographic domain.
| Table 2. Sample Frame Geographic Domains Required for CPES Weight Development | |
|---|---|
| CPES Domain | Domain Definition |
| 1 | Census Block Group >5% Cuban Population |
| 2 | Census Block Group >5% Vietnamese Population |
| 3 | Census Block Group >5% Filipino Population |
| 4 | Census Block Group >5% Puerto Rican Population |
| 5 | Census Block Group >5% Chinese Population |
| 6 | Census Block Group >10% Afro-Caribbean (non-Hispanic) (Restricted to NY, NJ, FL, CT, MA, RI and DC) |
| 7 | Census Block Group 60-100% African-American |
| 8 | Census Block Group 30-59.9% African-American |
| 9 | Census Block Group 10-29.9% African-American |
| 10 | Census Block Group 0-9.9% African-American |
| 11 | Hawaii (NLAAS only) |
All segment assignments to geographic domains were performed using Census 2000 data for Block Groups. Like the population assignments for mixed ancestry respondents, area segment domain assignments based on this 11-category classification were not always unique. For example, a Census Block Group might have contained a population that was >5% Vietnamese and also >5% Chinese. In cases where a Census Block Group qualified for more than one domain, the corresponding area segment was assigned to the lowest numbered category (e.g., the high density Vietnamese domain in this example).
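The tie-breaking rule described above can be sketched in a few lines of Python. This is an illustration only: the function name `assign_domain` is hypothetical, the input is assumed to be the set of Table 2 domain codes for which a Block Group qualifies, and the sketch covers domains 1-10 only, since the text does not state how the Hawaii domain (11) interacts with the priority order.

```python
def assign_domain(qualifying_domains):
    """Return the lowest-numbered Table 2 domain code among those a
    Census Block Group qualifies for (hypothetical helper)."""
    codes = set(qualifying_domains)
    if not codes:
        raise ValueError("a block group must qualify for at least one domain")
    return min(codes)

# A block group that is >5% Vietnamese (domain 2) and >5% Chinese (domain 5),
# and that also falls in the 0-9.9% African-American band (domain 10).
example = assign_domain({2, 5, 10})
print(example)
```

Under this rule every respondent in the corresponding area segment would inherit the single domain code returned for the segment.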
Case-specific population weights had been developed for each CPES component survey (Kessler et al., 2004; Heeringa, 2004; Heeringa et al., 2005). Each project had carefully developed and refined its weight vector to enable robust probability sampling inference ("design-based") to its chosen survey population. NCS-R was unique among the three component studies in that it required two final analysis weights--one weight for the full sample of cases who participated in the Part 1 interview and a second for the subsample of cases that also completed Part 2 of the NCS-R. Consequently, the CPES combined data set also has two analysis weights--the first for analysis of common data items and the second for analysis of survey items that NCS-R only administered to Part 2 respondents.
The integrated weight development began with the existing final population weights for the NCS-R, NSAL, and NLAAS. The integrated weight development then proceeded according to the following steps:
Step 1. Each NLAAS, NSAL, and NCS-R case was assigned to a race/ancestry category based on the categories and priority order provided in Table 1 (see Section III).
Step 2. Each NLAAS, NSAL, and NCS-R area segment was assigned to a geographic domain based on the definitions and priority order shown in Table 2 (see Section IV). Each NLAAS, NSAL, and NCS-R respondent was assigned to a geographic domain based on its area segment classification.
Step 3. The final population weight values for the three data sets were obtained from the NLAAS, NSAL, and NCS-R investigators. Since the final NCS-R and NSAL weights had been "centered" or "normalized" (mean weight=1.0), they were restored to the original U.S. population scaling based on weighted totals from the March 2002 demographic supplement of the Current Population Survey (CPS).
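As a rough illustration of Step 3, the following Python sketch rescales a set of normalized weights so that they sum to a population total. It assumes a single constant scaling factor per study, which the text does not confirm (the actual restoration may have been carried out within demographic groups); the function name and the numbers are hypothetical.

```python
def restore_population_scale(normalized_weights, population_total):
    """Multiply normalized weights (mean 1.0) by one constant so that
    they sum to the given population total (assumed approach)."""
    factor = population_total / sum(normalized_weights)
    return [w * factor for w in normalized_weights]

# Four hypothetical cases with mean weight 1.0, restored to a
# hypothetical population of 20,000 persons.
normalized = [0.5, 1.5, 1.2, 0.8]
restored = restore_population_scale(normalized, 20000.0)
```

Because every weight is multiplied by the same constant, the relative sizes of the weights are unchanged; only their sum moves to the population scale.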
Step 4. Notation: each case in the CPES pooled data set was indexed as follows:
| Table 3. Subscript notation for weight integration expressions | ||
|---|---|---|
| Index Subscript | Values | Representing |
| i | 1,...,n | Individual sample case subscript |
| j | 1,2,3 | Study index, 1=NCS-R; 2=NSAL, 3=NLAAS |
| k | 1-11 | Population index (Table 1), collapsing White, Other |
| l | 1-11 | Domain index (Table 2) |
Step 5. From the pooled data set, Excel spreadsheets were used to compute the number of nominal cases for each study by race/ancestry population by geographic domain cell, $n_{+jkl}$. These counts were then aggregated across the three studies to produce CPES pooled case counts for each population x domain cell:
$$n_{++kl} = \sum_{j=1}^{3} n_{+jkl}$$
Step 6. The March 2002 CPS data enabled estimation of post-stratification control totals for each race/ancestry group, k=1,...,11; however, it did not provide the geographic detail needed to allocate the population total to the l=1,...,11 geographic domains. For this purpose, the weighted population distribution from the CPES study with the most robust estimates of geographic distribution was used. NLAAS was chosen as the basis for allocating the Asian and Hispanic populations to the 11 sample geographic domains. NSAL weighted sample distributions were used to apportion the African-American and Afro-Caribbean populations to the geographic domains. White and Other population totals were allocated to geographic domains based on the empirical distribution of weights in the NCS-R.
$$N_{kl} = N_{k}^{CPS} \times \frac{\sum_{i} w_{i j^{*} k l}}{\sum_{l'=1}^{11} \sum_{i} w_{i j^{*} k l'}}$$
where:
$N_{kl}$ = the CPES control total for race/ethnicity population k and domain l,
$w_{i j^{*} k l}$ = the original study-specific weight for case i, study $j^{*}$ (sums over i run over the study $j^{*}$ cases in the indicated population x domain cell),
$j^{*}$ = the CPES study chosen to estimate the domain allocation for population k, and
$N_{k}^{CPS}$ = the March 2002 CPS population estimate for race/ethnicity category k.
Table 4 provides the final population controls for the race/ethnicity x domains cells of the CPES weight computation array.
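The allocation in Step 6 can be illustrated with a short Python sketch. It assumes that the CPS total for a population is split across domains in proportion to the sum of the reference study's weights in each domain; the function name `allocate_controls` and all numbers are hypothetical and are not taken from Table 4.

```python
from collections import defaultdict

def allocate_controls(reference_cases, cps_total):
    """Split one population's CPS total across geographic domains in
    proportion to the reference study's weighted domain distribution.

    reference_cases: iterable of (domain, weight) pairs for the cases of
    the reference study that belong to the population of interest.
    """
    weight_by_domain = defaultdict(float)
    for domain, weight in reference_cases:
        weight_by_domain[domain] += weight
    grand_total = sum(weight_by_domain.values())
    return {d: cps_total * s / grand_total for d, s in weight_by_domain.items()}

# Hypothetical reference-study cases for one population: two cases in
# domain 2 and one case in domain 10, with a CPS total of 2,000 persons.
controls = allocate_controls([(2, 300.0), (2, 100.0), (10, 600.0)], 2000.0)
```

By construction, the domain-level controls for a population sum back to that population's CPS total.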
Step 7. The original population weights from each study were post-stratified to the common race/ethnicity x domain population control totals derived from the March 2002 CPS (see Step 6 and Table 4).
$$w_{ijkl}^{(ps)} = w_{ijkl} \times \frac{N_{kl}}{\sum_{i} w_{ijkl}}$$
where:
$w_{ijkl}^{(ps)}$ = the study-specific weight adjusted to 2002 CPS population totals,
$w_{ijkl}$ = the original study-specific population weight for case i, study j (the sum in the denominator runs over the study j cases in population k and domain l), and
$N_{kl}$ = the 2002 CPS estimate for race/ethnicity population k allocated to domain l.
Since the original study-specific weights for the major populations of interest had already included some form of population-based control, this rescaling to a common post-stratification standard did not require major adjustments.
Step 8. Since in Step 7 the individual study-specific weights were controlled to exact counts for each race/ancestry x geographic domain cell, the remaining step involved rescaling the study-specific weights to reflect the proportion of nominal cases that each study contributed to the cell in the pooled data set.
$$w_{i}^{CPES} = w_{ijkl}^{(ps)} \times \frac{n_{+jkl}}{n_{++kl}}$$
where:
$w_{i}^{CPES}$ = the CPES population weight for case i;
$w_{ijkl}^{(ps)}$ = the standard population weight for case i, study j (assigned to population k and domain l), from Step 7; and
$n_{+jkl}$, $n_{++kl}$ = the study-specific and pooled nominal case counts for the cell (Step 5).
Conditional on the assigned population (k) and domain classification (l), this rescaling provided a "proportionate to sample size" contribution from each study. It linearly rescaled the weights for each individual study. It did not alter the distribution of the study-specific population weights except to reduce the study-specific mean by a factor of $n_{+jkl}/n_{++kl}$ and the variance of the study weights (not relvariance) by a factor of $(n_{+jkl}/n_{++kl})^2$.
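Steps 7 and 8 can be traced for a single population x domain cell with the following Python sketch. It is a simplified illustration under the assumptions stated in the text (post-stratification to the cell control total, followed by scaling by each study's share of nominal cases); the function name `integrate_cell` and the weights are hypothetical.

```python
def integrate_cell(study_weights, control_total):
    """Apply Steps 7 and 8 within one population x domain cell.

    study_weights: dict mapping a study label to the list of original
    weights for that study's cases in the cell.
    """
    n_pooled = sum(len(w) for w in study_weights.values())
    integrated = {}
    for study, weights in study_weights.items():
        # Step 7: post-stratify this study's weights to the cell control.
        ps_factor = control_total / sum(weights)
        # Step 8: scale by the share of nominal cases the study contributes.
        share = len(weights) / n_pooled
        integrated[study] = [w * ps_factor * share for w in weights]
    return integrated

# Hypothetical cell: 2 NCS-R cases and 6 NLAAS cases, control total 8,000.
cell = {
    "NCS-R": [900.0, 1100.0],
    "NLAAS": [200.0, 300.0, 250.0, 250.0, 400.0, 600.0],
}
result = integrate_cell(cell, control_total=8000.0)
pooled_sum = sum(sum(w) for w in result.values())
```

In this example the NCS-R cases carry 2/8 of the control total and the NLAAS cases carry 6/8, so the pooled weights again sum to the cell control total.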
| Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 1 of 2) | ||||||
|---|---|---|---|---|---|---|
| Race/Ancestry Population Group | Sample Frame Geographic Domain | |||||
| | CUBAN >5% | VIET >5% | FILIP >5% | PUERTO RICAN >5% | CHINESE >5% | AFRO-CARIB (see text) |
| VIETNAMESE | 2480 | 383349 | 38422 | 41222 | 55101 | 0 |
| FILIPINO | 9950 | 94392 | 289714 | 10579 | 25166 | 0 |
| CHINESE | 48732 | 54578 | 240247 | 132608 | 541553 | 0 |
| OTHER ASIAN | 35654 | 19422 | 221116 | 6934 | 247125 | 0 |
| CUBAN | 103587 | 3643 | 5000 | 10334 | 5000 | 6447 |
| PUERTO RICAN | 48478 | 19246 | 39505 | 1009335 | 19457 | 78449 |
| MEXICAN | 3750 | 287255 | 70701 | 509666 | 69584 | 0 |
| OTHER HISPANIC | 202542 | 110127 | 24934 | 1109915 | 164478 | 111795 |
| AFRO-CARIBBEAN | 3687 | 2500 | 1250 | 126246 | 0 | 577382 |
| AFRIC-AMERICAN | 102567 | 68475 | 242531 | 1571404 | 27500 | 62549 |
| WHITE AND OTHER | 1086796 | 739518 | 1210972 | 2026304 | 2191741 | 1250 |
| Total | 1648223 | 1782505 | 2384392 | 6554547 | 3346705 | 837872 |
| Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 2 of 2) | ||||||
|---|---|---|---|---|---|---|
| Race/Ancestry Population Group | Sample Frame Geographic Domain | |||||
| | AFRI-AMER 60-100% | AFRI-AMER 30-59.9% | AFRI-AMER 10-29.9% | AFRI-AMER 0-9.9% | HAWAII | Total |
| VIETNAMESE | 24555 | 4348 | 179945 | 437448 | 3403 | 1170273 |
| FILIPINO | 11442 | 27032 | 389766 | 674184 | 421617 | 1953842 |
| CHINESE | 10720 | 50153 | 255197 | 1039660 | 223468 | 2596916 |
| OTHER ASIAN | 44172 | 53381 | 592307 | 1749290 | 360977 | 3330378 |
| CUBAN | 16347 | 11412 | 149585 | 805628 | 0 | 1116983 |
| PUERTO RICAN | 68727 | 3229 | 374734 | 536439 | 37466 | 2235065 |
| MEXICAN | 282346 | 67944 | 2942722 | 11529509 | 0 | 15763477 |
| OTHER HISPANIC | 165520 | 203503 | 759917 | 2592100 | 2945 | 5447776 |
| AFRO-CARIBBEAN | 288119 | 220638 | 198343 | 12119 | 0 | 1430284 |
| AFRIC-AMERICAN | 10596921 | 5110387 | 3545866 | 721516 | 0 | 22049716 |
| WHITE AND OTHER | 1943090 | 6258727 | 18867372 | 118404822 | 0 | 152730592 |
| Total | 13451959 | 12010754 | 28255754 | 138502715 | 1049876 | 209825302 |
As noted in Section V above, n=9282 NCS-R survey respondents completed Part 1 of the two-part CIDI-based interview; however, only a subsample of n=5692 NCS-R respondents went on to complete the more in-depth Part 2 questionnaire modules. All n=6082 NSAL and n=4649 NLAAS respondents completed the full interview schedule--the equivalent of NCS-R Parts 1 and 2. To account for the split schedule of NCS-R questionnaire administration, two CPES pooled analysis weights were computed using the calculation sequence described in steps (5) through (8) in Section V above. The first weight, termed the "Part 1" weight, is labeled CPESWTSH in the merged CPES data set. It is the population weight that should be used for analysis involving variables that are included in Part 1 of the NCS-R. The second weight, termed the "Part 2" weight, is stored in the merged data set as the variable, CPESWTLG. It is the population weight that should be used when the CPES analysis includes variables that NCS-R only asked of the Part 2 subsample of respondents.
Table 5 provides a summary of selected distributional statistics for the final CPES Part 1 and Part 2 analysis weights.
| Table 5: Distributions of CPES Final Part 1 and Part 2 Analysis Weights | ||||||||
|---|---|---|---|---|---|---|---|---|
| Race/Ethnicity Population Group | CPES Final Analysis Weight | |||||||
| | CPESWTSH | | | | CPESWTLG | | | |
| | n | Mean | Minimum | Maximum | n | Mean | Minimum | Maximum |
| VIETNAMESE | 527 | 2221 | 760 | 11898 | 526 | 2225 | 281 | 11898 |
| FILIPINO | 525 | 3722 | 726 | 15307 | 520 | 3757 | 726 | 15307 |
| CHINESE | 619 | 4195 | 754 | 15820 | 613 | 4236 | 934 | 15820 |
| OTHER ASIAN | 613 | 5433 | 587 | 20950 | 519 | 6417 | 732 | 27255 |
| CUBAN | 625 | 1787 | 267 | 20949 | 610 | 6416 | 731 | 27225 |
| PUERTO RICAN | 654 | 3417 | 128 | 10846 | 620 | 3604 | 75 | 16372 |
| MEXICAN | 1442 | 10931 | 781 | 49382 | 1214 | 12984 | 373 | 89022 |
| OTHER HISPANIC | 899 | 6060 | 533 | 44888 | 820 | 6644 | 591 | 40582 |
| AFRO-CARIBBEAN | 1492 | 959 | 111 | 20796 | 1476 | 969 | 162 | 21040 |
| AFRIC-AMERICAN | 4746 | 4646 | 728 | 18594 | 4249 | 5189 | 978 | 36257 |
| WHITE AND OTHER | 7871 | 19362 | 1250 | 131331 | 5256 | 28800 | 1250 | 195000 |
| Total Sample | 20013 | 10468 | 111 | 131331 | 16423 | 12693 | 75 | 195000 |
The weights have been designed to enable analysts to compute unbiased or nearly unbiased estimates of population statistics and relationships (e.g. bivariate associations, regression relationships) for the larger CPES survey population of U.S. residents. Contemporary statistical software systems such as SAS, Stata, SPSS, and SUDAAN all provide the capability to conduct weighted analysis of the CPES survey data. CPES data analysts are encouraged to consult the user guides and help support for their chosen software package to learn the syntax and program specific features for conducting weighted analysis. The following paragraphs provide guidance on weighted analysis that is specific to the CPES data set.
VI.B.1 Part 1 or Part 2 Weight?
CPES analysts should consult the data documentation to determine if variables of interest in their analysis were obtained in Part 1 or Part 2 of the NCS-R interview. If the analysis includes only Part 1 variables, the CPESWTSH analysis weight should be used. It will include the full sample of NCS-R cases and provide the greatest precision for sample estimates of population characteristics or relationships. If the analysis includes one or more variables that NCS-R collected only in Part 2, the appropriate weight for population estimation is CPESWTLG.
In the calculation of the Part 1 and Part 2 weights, the absolute contributions from NSAL and NLAAS to the pooled weight calculation remained unchanged--only NCS-R required changes to the nominal case counts and initial rescaling steps. However, due to the reduced NCS-R sample size for the Part 2 variables, the relative contributions of the NSAL and NLAAS to any given race/ethnicity x sample domain weighting cell did change. Therefore, the final CPES Part 1 and Part 2 analysis weights (Step 8 above) differ for NSAL and NLAAS cases as they do for the NCS-R cases.
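The weight selection rule can be expressed as a small Python helper. The weight names CPESWTSH and CPESWTLG come from the data set; the function name and the variable names `var_a` and `var_b` are hypothetical placeholders, and the list of Part 2-only variables would have to be compiled by the analyst from the data documentation.

```python
def choose_cpes_weight(analysis_vars, part2_only_vars):
    """Return the name of the CPES weight to use: the Part 2 weight if any
    analysis variable was collected only in NCS-R Part 2, otherwise the
    Part 1 weight."""
    if set(analysis_vars) & set(part2_only_vars):
        return "CPESWTLG"
    return "CPESWTSH"

# Hypothetical variable names: var_b is assumed to be a Part 2-only item.
part2_only = {"var_b"}
weight_for_part1_analysis = choose_cpes_weight(["var_a"], part2_only)
weight_for_mixed_analysis = choose_cpes_weight(["var_a", "var_b"], part2_only)
```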
VI.B.2 Subsetting the CPES data by study
Occasionally, analysts may choose to extract CPES data for only one or two of the three component data sets. The CPES analysis weights will support this type of analysis; however, analysts should recognize that the weights for such a subset may not sum to the population control for that population. For example, consider an analysis that used only Afro-Caribbean data from the NLAAS and NSAL. Since a small number of Afro-Caribbeans interviewed in the NCS-R would be excluded from this analysis, the sum of weights for the combined NSAL and NLAAS cases would no longer match the CPES population control total for the Afro-Caribbean race/ethnicity population. A principle of weighted analysis of data is that population estimates and sampling errors (except for estimates of totals) should be invariant to any linear scaling of the weights (multiplication or division by a constant). Under the procedures used to compute the CPES Part 1 and Part 2 weights, this assumption of linear scaling applies when the data for one or two studies are used independently or are compared.
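The invariance principle stated above is easy to verify numerically. The Python sketch below uses hypothetical values and weights to show that a weighted mean is unchanged when every weight is multiplied by a constant, whereas a weighted total is not.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_total(values, weights):
    """Weighted total: sum(w * y)."""
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical binary outcome and weights for four cases.
y = [1, 0, 1, 1]
w = [1000.0, 2000.0, 1500.0, 500.0]
w_scaled = [0.5 * wi for wi in w]  # every weight multiplied by a constant

mean_original = weighted_mean(y, w)
mean_scaled = weighted_mean(y, w_scaled)
total_original = weighted_total(y, w)
total_scaled = weighted_total(y, w_scaled)
```

The two means agree, while the scaled total is half of the original total, which is why estimates of totals are the stated exception.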
VI.B.3 Subsetting the CPES data based on characteristics of respondents
In general, CPES analysts can apply the analysis weights for subpopulation analysis (e.g., estimation for women of Mexican-American ancestry). Provided all qualifying cases in the CPES data are included in the subpopulation analysis, the estimates would be unbiased and the sum of the CPES weights would be an unbiased estimator of the 2002 population count for that subset of the larger U.S. population. Experience has shown that due to the sheer numbers of observations and richness of the variable set, data sets such as the CPES generate interest in rare populations or populations for which the original samples were not optimal (e.g. women of Mexican-American ancestry living in the West Census Region and covered by a regional health maintenance organization (HMO) program). CPES analysts who have concerns about the appropriateness of the CPES for subpopulation analysis they are proposing to conduct are encouraged to consult a survey statistician.
VI.B.4 Item Missing Data for Analysis Variables
The original NCS-R, NSAL, and NLAAS analysis weights included adjustments for survey nonresponse. Through the process used to create the combined analysis weights, these adjustments for differential nonresponse are preserved in the CPES Part 1 and Part 2 weights. However, the CPES weights do not include adjustments for item missing data in the CPES data set. With a few special exceptions, most statistical software packages employ "case-wise" deletion as the means to address the problem of missing values for the analysis variables. That is, any case with a missing value on one or more of the variables in an analysis (e.g., a multivariate logistic regression model) is dropped from that analysis. If the amount of such case-wise deletion is substantial, the unbiasedness of the weighted estimation may be compromised. Analysts are encouraged to use standard data checking techniques to establish the patterns of missing data in their analysis variables and assess the extent and impact of software-driven case-wise deletion on the integrity of their analysis. If the variables of interest have high rates of item missing data, analysts may wish to consult a survey statistician about remediation approaches such as stochastic imputation (Little and Rubin, 2002).
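One simple check of the kind suggested above is to count the cases that case-wise deletion would drop and the share of the total weight they represent. The Python sketch below is an illustration only: the function name and the variable names `age` and `dx` are hypothetical, missing values are assumed to be coded as `None`, and the records are invented.

```python
def casewise_deletion_summary(records, analysis_vars, weight_var="CPESWTSH"):
    """Summarize how many cases, and what share of the summed weights,
    would be lost if cases with any missing analysis variable were dropped."""
    kept = [r for r in records if all(r.get(v) is not None for v in analysis_vars)]
    w_total = sum(r[weight_var] for r in records)
    w_kept = sum(r[weight_var] for r in kept)
    return {
        "n_total": len(records),
        "n_dropped": len(records) - len(kept),
        "weighted_share_dropped": 1.0 - w_kept / w_total,
    }

# Hypothetical records; None marks an item-missing value.
records = [
    {"CPESWTSH": 1000.0, "age": 34, "dx": 1},
    {"CPESWTSH": 3000.0, "age": None, "dx": 0},
    {"CPESWTSH": 2000.0, "age": 51, "dx": 0},
    {"CPESWTSH": 2000.0, "age": 29, "dx": None},
]
summary = casewise_deletion_summary(records, ["age", "dx"])
```

A large weighted share of dropped cases would be one signal that the remediation approaches mentioned above deserve consideration.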
The CPES data set is the product of the merger of three probability samples of the U.S. population and therefore shares the primary stage sample stratification and clustering features of the component sample designs. The NCS-R, NSAL and NLAAS sample designs were very similar in their basic structure to the multi-stage designs used for major survey programs such as the U.S. Health Interview Survey (HIS), the National Survey of Family Growth (NSFG) or other national scientific surveys. The survey literature refers to these samples as complex designs, a loosely-used term meant to denote the fact that the sample incorporates special design features such as stratification, clustering and differential selection probabilities (i.e., weighting) that analysts must consider in computing sampling errors for sample estimates of descriptive statistics and model parameters. Standard programs in statistical analysis software packages assume simple random sampling (SRS) or independence of observations in computing standard errors for sample estimates. In general, the SRS assumption results in underestimation of variances of survey estimates of descriptive statistics and model parameters. Confidence intervals based on computed variances that assume independence of observations will be biased (generally too narrow) and design-based inferences will be affected accordingly. Likewise, test statistics (t, χ², F) computed in complex survey data analysis using standard programs will tend to be biased upward and overstate the significance of tests of effects.
This section focuses on sampling error estimation and construction of confidence intervals for survey estimates of descriptive statistics such as means, proportions, ratios, and coefficients for linear and logistic regression models.
Over the past 50 years, advances in survey sampling theory have guided the development of a number of methods for correctly estimating variances from complex sample data sets. Sampling error programs that implement these complex sample variance estimation methods are available to CPES data analysts. The two most common approaches (Rust, 1985) to the estimation of sampling error for complex sample data are through the use of a Taylor Series linearization of the estimator (and corresponding approximation to its variance) or through the use of resampling variance estimation procedures such as Balanced Repeated Replication (BRR) or Jackknife Repeated Replication (JRR).
When survey data are collected using a complex sample design with unequal size clusters, most statistics of interest will not be simple linear functions of the observed data. The linearization approach applies Taylor's method to derive an approximate form of the estimator that is linear in statistics for which variances and covariances can be directly and easily estimated. Stata Release 8 and 9, SAS V8.2/V9.0, SUDAAN Version 9, and the most recent releases of SPSS are commercially available statistical software packages that include procedures that apply the Taylor Series method to sampling error estimation and inference for complex sample data.
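To make the linearization idea concrete, the following Python sketch computes a Taylor Series standard error for a weighted mean under a stratified design with clusters treated as sampled with replacement. It is a textbook-style illustration, not the algorithm of any of the packages named above; the function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def linearized_se_of_mean(rows):
    """rows: iterable of (stratum, cluster, weight, y) tuples.

    Returns (weighted mean, linearized standard error), treating clusters
    as sampled with replacement within strata.
    """
    rows = list(rows)
    w_sum = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / w_sum

    # Linearized variable z_i = w_i * (y_i - mean) / sum(w), totaled by cluster.
    z_totals = defaultdict(lambda: defaultdict(float))
    for h, c, w, y in rows:
        z_totals[h][c] += w * (y - mean) / w_sum

    variance = 0.0
    for clusters in z_totals.values():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError("each stratum needs at least two clusters")
        z_bar = sum(clusters.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in clusters.values())
    return mean, math.sqrt(variance)

# Hypothetical data: two strata, two clusters per stratum, two cases per cluster.
toy = [
    (1, 1, 10.0, 1), (1, 1, 10.0, 0), (1, 2, 10.0, 1), (1, 2, 10.0, 1),
    (2, 1, 20.0, 0), (2, 1, 20.0, 1), (2, 2, 20.0, 0), (2, 2, 20.0, 0),
]
toy_mean, toy_se = linearized_se_of_mean(toy)
```

With exactly two clusters per stratum, the variance reduces to the sum over strata of the squared difference between the two cluster totals of the linearized variable.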
Stata (StataCorp, 2005) is a more recent commercial entry to the available software for analysis of complex sample survey data and has a growing body of research users. Stata includes special versions of its standard analysis routines that are designed for the analysis of complex sample survey data. Special survey analysis programs are available for descriptive estimation of means (SVY MEAN), ratios (SVY RATIO), proportions (SVY TAB), and population totals (SVY TOTAL). Stata programs for multivariate analysis of survey data include linear regression (SVY REGRESS), logistic regression (SVY LOGIT) and probit regression (SVY PROBIT). Stata program offerings for survey data analysts are constantly being expanded. Information on the Stata analysis software system can be found on the Web at: http://www.stata.com.
Programs in SAS Version 9 (SAS, 2003; http://www.sas.com/) also use the Taylor Series method to estimate variances of means (PROC Surveymeans), proportions and cross-tabular analysis (PROC SurveyFreq), linear regression (PROC SurveyReg), and logistic regression (PROC SurveyLogistic).
SUDAAN (RTI, 2004) is a commercially available software system developed and marketed by the Research Triangle Institute of Research Triangle Park, North Carolina (USA). SUDAAN was developed as a stand-alone software system with capabilities for the more important methods for descriptive and multivariate analysis of survey data, including: estimation and inference for means, proportions, and rates (PROC DESCRIPT and PROC RATIO); contingency table analysis (PROC CROSSTAB); linear regression (PROC REGRESS); logistic regression (PROC LOGISTIC); log-linear models (PROC CATAN); and survival analysis (PROC SURVIVAL). SUDAAN V9.0 and earlier versions were designed to read directly from ASCII and SAS system data sets. The latest versions of SUDAAN permit procedures to be called directly from the SAS system. Information on SUDAAN is available at the following Web site address: http://www.rti.org/.
SPSS Version 14.0 (http://www.spss.com/) users can obtain the SPSS Complex Samples module, which supports Taylor Series linearization estimation of sampling errors for descriptive statistics (CSDESCRIPTIVES), cross-tabulated data (CSTABULATE), general linear models (CSGLM), and logistic regression (CSLOGISTIC).
BRR, JRR, and the bootstrap comprise a second class of nonparametric methods for conducting estimation and inference from complex sample data. As suggested by the generic label for this class of methods, BRR, JRR, and the bootstrap utilize replicated subsampling of the sample database to develop sampling variance estimates for linear and nonlinear statistics. WesVar PC (Westat, Inc., 2000) is a software system for personal computers that employs replicated variance estimation methods to conduct the more common types of statistical analysis of complex sample survey data. WesVar PC was developed by Westat, Inc. and is distributed along with documentation to researchers at Westat's Web site: http://www.westat.com/wesvarpc/ . WesVar PC includes a Windows-based application generator that enables the analyst to select the form of data input (SAS data file, SPSS for Windows data base, ASCII data set) and the computation method (BRR or JRR methods). Analysis programs contained in WesVar PC provide the capability for basic descriptive (means, proportions, totals, cross tabulations) and regression (linear, logistic) analysis of complex sample survey data. WesVar also provides the best facility for estimating quantiles of continuous variables (e.g., 95%-tile of a cognitive test score) from survey data. WesVar Complex Samples 4.0 is the latest version of WesVar. Researchers who wish to analyze the CPES data using WesVar PC should choose the BRR or JRR (JK2) replication option.
STATA V9 has introduced the option to use JRR or BRR calculation methods as an alternative to the Taylor Series method for all of its svy command options. SUDAAN V9.0 also allows the analysts to select the JRR method for computing sampling variances of survey estimates.
IVEWare is another software option for the JRR estimation of sampling errors for survey statistics. IVEWare has been developed by the Survey Methodology Program of the Survey Research Center and is available free of charge to users at: http://www.isr.umich.edu/src/smp/ive/ . IVEWare is based on SAS Macros and requires SAS Version 6.12 or higher. The system includes programs for multiple imputation of item missing data as well as programs for variance estimation in descriptive (means, proportions) and multivariate (regression, logistic regression, survival analysis) analysis of complex sample survey data.
These new and updated software packages include an expanded set of user-friendly, well-documented analysis procedures. Difficulties with sample design specification, data preparation, and data input in the earlier generations of survey analysis software created a barrier to use by analysts who were not survey design specialists. The new software enables the user to input data and output results in a variety of common formats, and the latest versions accommodate direct input of data files from the major analysis software systems.
Regardless of whether the linearization method or a resampling approach is used, estimation of variances for complex sample survey estimates requires the specification of a sampling error computation model. CPES data analysts who are interested in performing sampling error computations should be aware that the estimation programs identified in the preceding section assume a specific sampling error computation model and will require special sampling error codes. Individual records in the analysis data set must be assigned sampling error codes that identify to the programs the complex structure of the sample (stratification, clustering) and are compatible with the computation algorithms of the various programs. To facilitate the computation of sampling error for statistics based on CPES data, design-specific sampling error codes will be routinely included in all versions of the data set. Although minor recoding may be required to conform to the input requirements of the individual programs, the sampling error codes that are provided should enable analysts to conduct either Taylor Series or Replicated estimation of sampling errors for survey statistics. In programs that use the Taylor Series Linearization method, the sampling error codes (SESTRAT and SECLUSTR) will typically be input as keyword statements (SAS V9.1, SUDAAN V9.0) or as global settings (Stata V9) along with the analysis weight and will be used directly in the computational algorithms. Programs that permit BRR or JRR computations will require the user supplied sampling error codes to construct "replicate weights" that are required for these approaches to variance estimation.
Two sampling error code variables are defined for each case based on the sample design stratum and primary stage unit (PSU) cluster in which the sample respondent resided: Sampling Error Stratum Code (SESTRAT) and Sampling Error Cluster Code (SECLUSTR). The CPES SESTRAT codes were derived directly from a concatenation of the existing sampling error stratum codes for the NCS-R, NSAL and NLAAS sample designs. A total of 180 sampling error strata were defined. These were allocated to the individual contributing samples according to the coding scheme shown in Table 6.
Table 6. CPES Sampling Error Strata

| CPES Component Sample | CPES Sampling Error Strata |
|---|---|
| NCS-R | 1-42 |
| NSAL | 43-111 |
| NLAAS | 112-180 |
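The allocation in Table 6 can be used to recover the contributing study from a SESTRAT value. The following is a minimal sketch; the dictionary and function names are hypothetical, and only the stratum ranges come from Table 6.

```python
# Stratum ranges from Table 6 (inclusive at both ends).
CPES_STRATA = {
    "NCS-R": range(1, 43),     # strata 1-42
    "NSAL": range(43, 112),    # strata 43-111
    "NLAAS": range(112, 181),  # strata 112-180
}


def component_for_stratum(sestrat):
    """Return the CPES component sample that a SESTRAT code belongs to."""
    for name, codes in CPES_STRATA.items():
        if sestrat in codes:
            return name
    raise ValueError(f"SESTRAT {sestrat} is outside the range 1-180")


print(component_for_stratum(57))
```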
All original sampling error strata definitions for the NCS-R and NLAAS were preserved unchanged in the mapping to the CPES sampling error stratum code. In general, the assignment of NSAL cases to CPES sampling error strata also followed the original NSAL coding. The single exception involved a NSAL sampling error stratum that included multiple clusters. This stratum was divided into several pseudo-strata each with a pair of combined clusters. This minor change enables CPES analysts to use any of the sampling error calculation methods (Taylor, BRR or JRR) without having to perform additional recoding of the sampling error variables.
Likewise, with one exception, the values of SECLUSTR for CPES sampling error strata are identical to those in the original NCS-R, NSAL and NLAAS data sets. The exception was the cluster numbering for the one NSAL sampling error stratum with multiple clusters. Clusters in this stratum were randomly grouped into pairs and assigned to pseudo-strata as described in the preceding paragraph. The result is that the CPES SECLUSTR code takes a value of either 1 or 2 and exactly two sampling error clusters are assigned to each sampling error stratum.
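Because every sampling error stratum holds exactly two clusters, replicate weights for a paired jackknife can be built directly from SESTRAT and SECLUSTR. The sketch below shows one common variant, assumed here for illustration: for each stratum, one replicate doubles the weights in cluster 1 and zeroes those in cluster 2, and the variance is the sum of squared deviations of the replicate estimates from the full-sample estimate. The function names and toy data are hypothetical, and individual packages may construct JRR replicates differently.

```python
def weighted_mean(weights, values):
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def jrr_mean_variance(records):
    """Paired-jackknife variance of a weighted mean.

    records: iterable of (sestrat, seclustr, weight, y), seclustr in {1, 2}.
    """
    records = list(records)
    values = [y for _, _, _, y in records]
    base = [w for _, _, w, _ in records]
    full = weighted_mean(base, values)

    variance = 0.0
    for h in sorted({stratum for stratum, _, _, _ in records}):
        # Replicate h: within stratum h, double cluster 1 and drop cluster 2;
        # weights in all other strata are unchanged.
        rep = []
        for stratum, cluster, w, _ in records:
            if stratum != h:
                rep.append(w)
            elif cluster == 1:
                rep.append(2.0 * w)
            else:
                rep.append(0.0)
        variance += (weighted_mean(rep, values) - full) ** 2
    return full, variance


# Hypothetical toy data: two strata with two clusters each.
toy = [
    (1, 1, 1.0, 1.0),
    (1, 1, 1.0, 3.0),
    (1, 2, 2.0, 2.0),
    (2, 1, 1.0, 4.0),
    (2, 2, 1.0, 6.0),
]
jrr_mean, jrr_var = jrr_mean_variance(toy)
print(jrr_mean, jrr_var ** 0.5)
```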
The following two sections provide a short overview of the general syntax and command file structure for computing sampling errors using Stata and SAS programs that have been designed for the analysis of complex sample survey data. Analysts are referred to the user guides and the on-line help facilities of these two software systems for documentation of the individual programs.
VII.E.1 Stata Command Syntax
As described above, CPES data analysts who are familiar with the Stata software system can use Stata's "svy" commands for the analysis of complex sample survey data. Stata Version 9 syntax for some of the more commonly used analysis programs is illustrated below (shown for the Part 2 weight option):
.svyset seclustr [pweight=cpeswtlg], strata(sestrat)
This statement defines the sample design variables for the duration of the analysis session. SVY commands issued after this statement will automatically incorporate these design specifications.
To conduct analyses, the following Stata commands and syntax are used (please refer to the Stata V9 Reference Manual for specific command syntax and output options):
.svy, vce(linearized): mean vars
[estimates, standard errors, design effects for means]
.svy, vce(linearized): tab v1 v2
[estimates, standard errors for proportions of single variable categories, or crosstabulations of two variables with tests of independence]
.svy, vce(linearized): regress dep x1 ...
[simple linear regression model for a continuous dependent variable]
.svy, vce(linearized): logit dep x1 ...
[simple logistic regression model for a binary dependent variable]
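To estimate single statistics for subpopulations of the survey population in Stata, the over() option is used with the descriptive estimation commands (illustrated for svy: mean):
.svy, vce(linearized): mean vars, over(var)
where var is a categorical variable that defines the subpopulations for which separate estimates are desired (e.g., gender). The svy: tab command and the regression commands do not accept over(); for these, the subpop() option of the svy prefix restricts estimation to a single subpopulation:
.svy, vce(linearized) subpop(indicator): tab v1 v2
where indicator is a 0/1 variable that flags the cases belonging to the subpopulation.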
VII.E.2 SAS Version 9 Command Syntax
SAS Version 9 includes four programs for the analysis of complex sample survey data: PROCS Surveymeans, SurveyFreq, SurveyReg and SurveyLogistic. The general syntax for specifying the CPES design structure in the SAS system is as follows:
PROC SurveyXXXX data=libname.filename;
STRATUM SESTRAT;
CLUSTER SECLUSTR;
WEIGHT CPESWTLG;
program specific statements here;
RUN;
Users are referred to the SAS/STAT(R) 9.1 User's Guide (SAS, 2004) for documentation on program-specific statements, keywords and options.
Final weights were also developed for analyzing pairs of the CPES studies (NCS-R and NSAL, NCS-R and NLAAS, and NSAL and NLAAS) and for Part I and Part II sub-samples (only for pairs that include NCS-R respondents). This will allow for generating population estimates by analyzing data from study pairs only. Table 7 below summarizes the paired weights and key descriptive statistics for the weight distributions.
Table 7: Descriptive Statistics for Paired Study Weights

| Study Pair | Sample | Variable Name | Sample Size | Mean Weight | Standard Deviation of Weights | Sum of Weights |
|---|---|---|---|---|---|---|
| NCS-R and NSAL | Short | NCNSWTSH | 15364 | 13588.6116 | 13251.1943 | 208775428 |
| NCS-R and NSAL | Long | NCNSWTLG | 11774 | 17731.9026 | 26960.0142 | 208775421 |
| NCS-R and NLAAS | Short | NCNLWTSH | 13931 | 15061.7548 | 11814.4696 | 209825306 |
| NCS-R and NLAAS | Long | NCNLWTLG | 10341 | 20290.6198 | 26553.902 | 209825299 |
| NSAL and NLAAS | ---a | NSNLWT | 10731 | 19553.1922 | 63077.1289 | 209825306 |
a. Short and long sub-samples do not apply because the NCS-R sample is not included.
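A quick consistency check on Table 7 is that the sample size multiplied by the mean weight should reproduce the sum of weights, up to rounding of the published mean. The sketch below carries out that arithmetic; the list name and layout are hypothetical, and the figures are copied from Table 7.

```python
# (variable name, sample size, mean weight, sum of weights) from Table 7.
TABLE7 = [
    ("NCNSWTSH", 15364, 13588.6116, 208775428),
    ("NCNSWTLG", 11774, 17731.9026, 208775421),
    ("NCNLWTSH", 13931, 15061.7548, 209825306),
    ("NCNLWTLG", 10341, 20290.6198, 209825299),
    ("NSNLWT", 10731, 19553.1922, 209825306),
]

# Difference between the implied total (n * mean) and the published sum.
differences = {
    name: n * mean_w - total for name, n, mean_w, total in TABLE7
}
for name, diff in differences.items():
    print(f"{name}: implied total differs from published sum by {diff:+.2f}")
```

Each paired weight sums to roughly 209 million, so the differences printed here are negligible relative to the totals.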
Weights for analysis of CPES study pairs are based on the final CPES 3-study weights and were developed according to the following steps:
Step 1. Each NLAAS, NSAL and NCS-R case was assigned to a race/ancestry category based on the categories and priority order provided in Table 1 (see Section III).
Step 2. Each NLAAS, NSAL and NCS-R area segment was assigned to a geographic domain based on the definitions and priority order shown in Table 2 (see Section IV). Each NLAAS, NSAL and NCS-R respondent was assigned to a geographic domain based on its area segment classification.
Step 3. For each pair of studies, race x domain cell counts were obtained. Because certain race groups were not over-sampled in specific pairs (Asians in NCS-R and NSAL), some of the CPES race x domain cells had no cases or only a small number of cases. Such small cell counts could affect the robustness of post-stratification adjustments and are usually dealt with by collapsing cells. Collapsing was done mainly over similar race groups (e.g., Vietnamese, Filipino, Chinese and other Asian groups) within a domain. If the cell count was still small (mainly <10) after race group collapsing, similar domains were then collapsed (examples of collapsed domains include Census Block Group > 5% Cuban and Census Block Group > 5% Puerto Rican). The same collapsed groups were used for the long and the short sub-samples (whenever applicable). Collapsed groups for each pair of studies are shown in Tables 8-12.
Step 4. CPS 2002 totals were calculated for each collapsed group. Weighted counts using the final CPES weight (short form when dealing with short sample and long form when dealing with long sub-sample) were also generated for each collapsed group.
Step 5. A post-stratification adjustment factor (CPS 2002 total divided by the weighted count using the final CPES weight) was calculated for each collapsed group and applied to the final CPES weight to generate the paired weights. All cases belonging to the same collapsed race x domain cell received the same factor.
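Steps 4 and 5 can be sketched in a few lines of Python. The function name, the record layout and the toy cases are hypothetical; the only figures taken from this report are the CPS 2002 total and the CPES weighted count for the "Asians over all domains" group in Table 8.

```python
from collections import defaultdict


def poststratify(cases, cps_totals):
    """Compute per-cell adjustment factors and the adjusted weights.

    cases: list of (cell, cpes_weight) tuples, one per respondent.
    cps_totals: dict mapping each collapsed cell to its CPS 2002 total.
    """
    # Step 4: weighted count of each collapsed cell using the CPES weight.
    weighted_count = defaultdict(float)
    for cell, w in cases:
        weighted_count[cell] += w

    # Step 5: factor = CPS total / weighted count, applied to every case
    # in the cell.
    factors = {c: cps_totals[c] / weighted_count[c] for c in weighted_count}
    paired_weights = [w * factors[cell] for cell, w in cases]
    return factors, paired_weights


# Hypothetical toy cases in two collapsed cells.
cases = [("A", 100.0), ("A", 300.0), ("B", 500.0)]
factors, paired = poststratify(cases, {"A": 800.0, "B": 500.0})
print(factors, paired)

# Factor for "Asians over all domains" using the Table 8 figures.
asian_factor = 8041944 / 959391
print(round(asian_factor, 6))
```

After the adjustment, the paired weights in each collapsed cell sum to that cell's CPS 2002 total.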
Table 8: NCS-R and NSAL collapsed cells for the short sample, post-stratification adjustment factor and mean weights

| Collapsed groups for NCSR & NSAL short | CPS 2002 | Un-weighted Count | Count using CPES weights | Post-stratification adjustment factor | Mean weight (ncnswtsh) |
|---|---|---|---|---|---|
| Asians over all domains | 8041944 | 189 | 959391 | 8.382342549 | 5076.142857 |
| Cuban, PR and other Hisp over all domains | 8759413 | 492 | 2057662 | 4.256973692 | 4182.239837 |
| Mexicans over all domains | 15763477 | 574 | 6051909 | 2.604711505 | 10543.39547 |
| Africocarib and AA in Cuban blocks | 106254 | 35 | 106253 | 1.000009411 | 3035.8 |
| Whites in Cuban blocks | 1086796 | 26 | 611832 | 1.776298069 | 23532 |
| Afrocarib and AA in Vit blocks | 70975 | 30 | 70976 | 0.999985911 | 2365.866667 |
| Whites in Vit blocks | 739518 | 37 | 691510 | 1.069424882 | 18689.45946 |
| Afrocarib and AA in Flip blocks | 243781 | 55 | 243780 | 1.000004102 | 4432.363636 |
| Whites in Flip blocks | 1210972 | 65 | 944809 | 1.281710907 | 14535.52308 |
| Afrocarib in PR blocks | 126246 | 57 | 126245 | 1.000007921 | 2214.824561 |
| AA in PR blocks | 1571404 | 230 | 1571412 | 0.999994909 | 6832.226087 |
| Whites in PR blocks | 2026304 | 130 | 2014461 | 1.005878992 | 15495.85385 |
| Afrocarib and AA in Chinese blocks | 27500 | 22 | 27500 | 1 | 1250 |
| Whites in Chinese blocks | 2191741 | 123 | 2666727 | 0.82188428 | 21680.70732 |
| Afrocarib in Afrocarib blocks | 577382 | 933 | 577386 | 0.999993072 | 618.8488746 |
| AA in Afrocarib blocks | 62549 | 12 | 62549 | 1 | 5212.416667 |
| Whites in Afrocarib and AA 60-100 | 1944340 | 267 | 1944339 | 1.000000514 | 7282.168539 |
| Afrocarib in AA 60-100 | 288119 | 256 | 288116 | 1.000010412 | 1125.453125 |
| AA in AA 60-100 | 10596921 | 2343 | 10596905 | 1.00000151 | 4522.793427 |
| Afrocarib in AA 30-59.9 | 220638 | 172 | 220642 | 0.999981871 | 1282.802326 |
| AA in AA 30-59.9 | 5110387 | 1122 | 5110400 | 0.999997456 | 4554.723708 |
| Whites in AA 30-59.9 | 6258727 | 638 | 6258683 | 1.00000703 | 9809.847962 |
| Afrocarib in AA 10-29.9 | 198343 | 48 | 198344 | 0.999994958 | 4132.166667 |
| AA in AA 10-29.9 | 3545866 | 703 | 3545843 | 1.000006486 | 5043.8734 |
| Whites in AA 10-29.9 | 18867372 | 1216 | 18867235 | 1.000007261 | 15515.81826 |
| Afrocarib and AA in AA 0-9.9 | 733635 | 220 | 733649 | 0.999980917 | 3334.768182 |
| Whites in AA 0-9.9 | 118404822 | 5369 | 118405824 | 0.999991538 | 22053.60849 |
Table 9: NCS-R and NSAL collapsed cells for the long sample, post-stratification adjustment factor and mean weights

| Collapsed groups for NCSR & NSAL long | CPS 2002 | Un-weighted Count | Count using CPES weights | Post-stratification adjustment factor | Mean weight (ncnswtlg) |
|---|---|---|---|---|---|
| Asians over all domains | 8041944 | 83 | 461432 | 17.42823211 | 5559.421687 |
| Cuban, PR and other Hisp over all domains | 8759413 | 364 | 1524027 | 5.747544499 | 4186.887363 |
| Mexican over all domains | 15763477 | 346 | 4312401 | 3.655382929 | 12463.58671 |
| Afrocarib and AA in Cuban blocks | 106254 | 25 | 106254 | 1 | 4250.16 |
| Whites in Cuban blocks | 1086796 | 18 | 1055336 | 1.029810411 | 58629.77778 |
| Afrocarib and AA in Vit blocks | 70975 | 21 | 70975 | 1 | 3379.761905 |
| Whites in Vit blocks | 739518 | 23 | 739527 | 0.99998783 | 32153.34783 |
| Afrocarib and AA in Flip blocks | 243781 | 47 | 243782 | 0.999995898 | 5186.851064 |
| Whites in Flip blocks | 1210972 | 48 | 1210979 | 0.99999422 | 25228.72917 |
| Afrocarib in PR blocks | 126246 | 55 | 126245 | 1.000007921 | 2295.363636 |
| AA in PR blocks | 1571404 | 202 | 1571417 | 0.999991727 | 7779.292079 |
| Whites in PR blocks | 2026304 | 90 | 2026314 | 0.999995065 | 22514.6 |
| Afrocarib and AA in Chinese blocks | 27500 | 14 | 27500 | 1 | 1964.285714 |
| Whites in Chinese blocks | 2191741 | 71 | 2191747 | 0.999997262 | 30869.67606 |
| Afrocarib in Afrocarib blocks | 577382 | 933 | 577386 | 0.999993072 | 618.8488746 |
| AA in Afrocarib blocks | 62549 | 12 | 62549 | 1 | 5212.416667 |
| Whites in Afrocarib and AA 60-100 | 1944340 | 221 | 1909508 | 1.018241348 | 8640.307692 |
| Afrocarib in AA 60-100 | 288119 | 253 | 288116 | 1.000010412 | 1138.798419 |
| AA in AA 60-100 | 10596921 | 2137 | 10596881 | 1.000003775 | 4958.765091 |
| Afrocarib in AA 30-59.9 | 220638 | 170 | 220642 | 0.999981871 | 1297.894118 |
| AA in AA 30-59.9 | 5110387 | 1025 | 5110397 | 0.999998043 | 4985.753171 |
| Whites in AA 30-59.9 | 6258727 | 509 | 5852364 | 1.069435702 | 11497.76817 |
| Afrocarib in AA 10-29.9 | 198343 | 46 | 198345 | 0.999989917 | 4311.847826 |
| AA in AA 10-29.9 | 3545866 | 621 | 3545850 | 1.000004512 | 5709.903382 |
| Whites in AA 10-29.9 | 18867372 | 891 | 20300571 | 0.92940105 | 22784.0303 |
| Afrocarib and AA in AA 0-9.9 | 733635 | 164 | 733636 | 0.999998637 | 4473.390244 |
| Whites in AA 0-9.9 | 118404822 | 3385 | 116088323 | 1.019954625 | 34294.92555 |
Table 10: NCS-R and NLAAS collapsed cells for the short sample, post-stratification adjustment factor and mean weights

| Collapsed groups for NCSR & NLAAS short | CPS 2002 | Un-weighted Count | Count using CPES weights | Post-stratification adjustment factor | Mean weight (ncnlwtsh) |
|---|---|---|---|---|---|
| Asians in Cuban blocks | 96816 | 18 | 96803 | 1.000134293 | 5377.944444 |
| Cubans in Cuban blocks | 103587 | 70 | 103588 | 0.999990346 | 1479.828571 |
| PR in Cuban blocks | 48478 | 26 | 48478 | 1 | 1864.538462 |
| Mexican and other His in Cuban blocks | 206292 | 46 | 201688 | 1.022827337 | 4384.521739 |
| Afrocarib and AA in Cuban blocks | 106254 | 19 | 53552 | 1.984127577 | 2818.526316 |
| Whites in Cuban blocks | 1086796 | 25 | 588300 | 1.847349992 | 23532 |
| Vit in Vit blocks | 383349 | 208 | 383349 | 1 | 1843.024038 |
| Flip in Vit blocks | 94392 | 19 | 94392 | 1 | 4968 |
| Chinese and Other Asians in Vit blocks | 74000 | 18 | 74000 | 1 | 4111.111111 |
| Cuban, PR and Mexican in Vit blocks | 310144 | 35 | 310139 | 1.000016122 | 8861.114286 |
| Other Hispanics in Vit blocks | 110127 | 13 | 110127 | 1 | 8471.307692 |
| Afrocarib and AA in Vit blocks | 70975 | 21 | 48966 | 1.449475146 | 2331.714286 |
| Whites in Vit blocks | 739518 | 37 | 691510 | 1.069424882 | 18689.45946 |
| Vit in Flip Blocks | 38422 | 17 | 38422 | 1 | 2260.117647 |
| Flip in Flip Blocks | 289714 | 83 | 289714 | 1 | 3490.53012 |
| Chinese in Flip Blocks | 240247 | 68 | 240247 | 1 | 3533.044118 |
| other Asians in Flip blocks | 221116 | 39 | 221116 | 1 | 5669.641026 |
| Cuban and PR in Flip blocks | 44505 | 13 | 44505 | 1 | 3423.461538 |
| Mexican and other Hisp in Flip block | 95635 | 13 | 95634 | 1.000010457 | 7356.461538 |
| Africocarib and AA in Flip blocks | 243781 | 25 | 110449 | 2.207181595 | 4417.96 |
| Whites in Flip blocks | 1210972 | 51 | 741314 | 1.633547997 | 14535.56863 |
| Vit and Flip in PR blocks | 51801 | 14 | 51801 | 1 | 3700.071429 |
| Chinese and other Asians in PR blocks | 139542 | 22 | 139541 | 1.000007166 | 6342.772727 |
| Cuban and PR in PR blocks | 1019669 | 264 | 980425 | 1.040027539 | 3713.731061 |
| Mexican in PR blocks | 509666 | 44 | 509661 | 1.00000981 | 11583.20455 |
| other Hisp in PR blocks | 1109915 | 215 | 1074917 | 1.032558793 | 4999.613953 |
| Afrocarib and AA in PR blocks | 1697650 | 71 | 420440 | 4.03779374 | 5921.690141 |
| Whites in PR block | 2026304 | 110 | 1704545 | 1.18876533 | 15495.86364 |
| Vit in Chinese blocks | 55101 | 19 | 55101 | 1 | 2900.052632 |
| Flip in Chinese blocks | 25166 | 10 | 25165 | 1.000039738 | 2516.5 |
| Chinese in Chinese blocks | 541553 | 119 | 541553 | 1 | 4550.865546 |
| other Asians in Chinese block | 247125 | 49 | 247126 | 0.999995953 | 5043.387755 |
| Cuban, PR and Mexican in Chinese blocks | 94041 | 23 | 94041 | 1 | 4088.73913 |
| Other Hispanics in Chinese blocks | 164478 | 24 | 164479 | 0.99999392 | 6853.291667 |
| Afrocarib and AA in Chinese blocks | 27500 | 22 | 27500 | 1 | 1250 |
| Whites in Chinese blocks | 2191741 | 123 | 2666727 | 0.82188428 | 21680.70732 |
| Asians in Africocarib and AA 60-100 blocks | 90889 | 16 | 90889 | 1 | 5680.5625 |
| Cuban and PR in Africocarib and AA 60-100 | 169970 | 28 | 61807 | 2.750012135 | 2207.392857 |
| Mexican in Africocarib and AA 60-100 | 282346 | 27 | 282351 | 0.999982292 | 10457.44444 |
| Other His in Africocarib and AA 60-100 | 277315 | 19 | 101448 | 2.733567936 | 5339.368421 |
| Whites in Africocarib and AA 60-100 | 1944340 | 91 | 664740 | 2.924963143 | 7304.835165 |
| Asians in AA 30-59.9 | 134914 | 24 | 134913 | 1.000007412 | 5621.375 |
| Cubans, PR, and Mexicans in AA 30-59.9 | 82585 | 31 | 77710 | 1.062733239 | 2506.774194 |
| Other His in AA 30-59.9 | 203503 | 30 | 152627 | 1.333335517 | 5087.566667 |
| Whites in AA 30-59.9 | 6258727 | 302 | 2962549 | 2.112615521 | 9809.764901 |
| Vit in AA 10-29.9 | 179945 | 71 | 179945 | 1 | 2534.43662 |
| Flip in AA 10-29.9 | 389766 | 93 | 389766 | 1 | 4191.032258 |
| Chinese in AA 10-29.9 | 255197 | 62 | 255197 | 1 | 4116.080645 |
| Other Asian in AA 10-29.9 | 592307 | 88 | 592305 | 1.000003377 | 6730.738636 |
| Cubans in AA 10-29.9 | 149585 | 81 | 147761 | 1.012344259 | 1824.209877 |
| PR in AA 10-29.9 | 374734 | 94 | 366927 | 1.021276712 | 3903.478723 |
| Mexicans in AA 10-29.9 | 2942722 | 216 | 2942725 | 0.999998981 | 13623.72685 |
| Other Hisp in AA 10-29.9 | 759917 | 99 | 737569 | 1.030299538 | 7450.191919 |
| Whites in AA 10-29.9 | 18867372 | 925 | 14352204 | 1.314597535 | 15515.89622 |
| Vit in AA 0-9.9 | 437448 | 188 | 437448 | 1 | 2326.851064 |
| Flip in AA 0-9.9 | 674184 | 179 | 674183 | 1.000001483 | 3766.385475 |
| Chinese in AA 0-9.9 | 1039660 | 259 | 1039660 | 1 | 4014.131274 |
| other asian in AA 0-9.9 | 1749290 | 320 | 1749288 | 1.000001143 | 5466.525 |
| Cubans in AA 0-9.9 | 805628 | 435 | 805629 | 0.999998759 | 1852.02069 |
| PR in AA 0-9.9 | 536439 | 156 | 536438 | 1.000001864 | 3438.705128 |
| Mexican in AA 0-9.9 | 11529509 | 1083 | 11529523 | 0.999998786 | 10645.91228 |
| Other His in AA 0-9.9 | 2592100 | 340 | 2576935 | 1.005884898 | 7579.220588 |
| Africocarib and AA in Africocarib and AA blocks | 21333840 | 1072 | 4726615 | 4.513555684 | 4409.155784 |
| Whites AA 0-9.9 | 118404822 | 5316 | 117237005 | 1.009961164 | 22053.61268 |
| Vit in Hawaiian blocks | 3403 | 2 | 3403 | 1 | 1701.5 |
| Flip in Hawaiian blocks | 421617 | 126 | 421617 | 1 | 3346.166667 |
| Chinese in Hawaiian blocks | 223468 | 69 | 223469 | 0.999995525 | 3238.681159 |
| other Asian in Hawaiian blocks | 360977 | 84 | 360977 | 1 | 4297.345238 |
| Cuban in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| PR in Hawaiian blocks | 37466 | 11 | 37466 | 1 | 3406 |
| Mexican in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Other Hisp in Hawaiian blocks | 2945 | 1 | 2945 | 1 | 2945 |
| Africocarib in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| AA in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Whites in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
Table 11: NCS-R and NLAAS collapsed cells for the long sample, post-stratification adjustment factor and mean weights

| Collapsed groups for NCSR & NLAAS long | CPS 2002 | Un-weighted Count | Count using CPES weights | Post-stratification adjustment factor | Mean weight (ncnlwtlg) |
|---|---|---|---|---|---|
| Asians in Cuban blocks | 96816 | 14 | 96802 | 1.000144625 | 6914.428571 |
| Cubans in Cuban blocks | 103587 | 63 | 103587 | 1 | 1644.238095 |
| PR in Cuban blocks | 48478 | 23 | 48478 | 1 | 2107.73913 |
| Mexican and other His in Cuban blocks | 206292 | 42 | 201229 | 1.025160389 | 4791.166667 |
| Afrocarib and AA in Cuban blocks | 106254 | 9 | 37583 | 2.827182503 | 4175.888889 |
| Whites in Cuban blocks | 1086796 | 17 | 994958 | 1.092303394 | 58526.94118 |
| Vit in Vit blocks | 383349 | 208 | 383349 | 1 | 1843.024038 |
| Flip in Vit blocks | 94392 | 19 | 94392 | 1 | 4968 |
| Chinese and Other Asians in Vit blocks | 74000 | 13 | 74000 | 1 | 5692.307692 |
| Cuban, PR and Mexican in Vit blocks | 310144 | 30 | 310142 | 1.000006449 | 10338.06667 |
| Other Hispanics in Vit blocks | 110127 | 10 | 110127 | 1 | 11012.7 |
| Afrocarib and AA in Vit blocks | 70975 | 12 | 38540 | 1.84159315 | 3211.666667 |
| Whites in Vit blocks | 739518 | 23 | 739527 | 0.99998783 | 32153.34783 |
| Vit in Flip Blocks | 38422 | 17 | 38422 | 1 | 2260.117647 |
| Flip in Flip Blocks | 289714 | 83 | 289714 | 1 | 3490.53012 |
| Chinese in Flip Blocks | 240247 | 68 | 240247 | 1 | 3533.044118 |
| Other Asians in Flip blocks | 221116 | 34 | 221117 | 0.999995478 | 6503.441176 |
| Cuban and PR in Flip blocks | 44505 | 12 | 44505 | 1 | 3708.75 |
| Mexican and other Hisp in Flip block | 95635 | 10 | 95634 | 1.000010457 | 9563.4 |
| Africocarib and AA in Flip blocks | 243781 | 17 | 86858 | 2.806661447 | 5109.294118 |
| Whites in Flip blocks | 1210972 | 34 | 857780 | 1.411751265 | 25228.82353 |
| Vit and Flip in PR blocks | 51801 | 14 | 51801 | 1 | 3700.071429 |
| Chinese and other Asians in PR blocks | 139542 | 20 | 139542 | 1 | 6977.1 |
| Cuban and PR in PR blocks | 1019669 | 253 | 978632 | 1.041933025 | 3868.110672 |
| Mexican in PR blocks | 509666 | 41 | 509665 | 1.000001962 | 12430.85366 |
| other Hisp in PR blocks | 1109915 | 209 | 1073945 | 1.033493335 | 5138.492823 |
| Afrocarib and AA in PR blocks | 1697650 | 41 | 253143 | 6.70628854 | 6174.219512 |
| Whites in PR block | 2026304 | 70 | 1576027 | 1.285703862 | 22514.67143 |
| Vit in Chinese blocks | 55101 | 19 | 55101 | 1 | 2900.052632 |
| Flip in Chinese blocks | 25166 | 10 | 25165 | 1.000039738 | 2516.5 |
| Chinese in Chinese blocks | 541553 | 118 | 541553 | 1 | 4589.432203 |
| Other Asians in Chinese block | 247125 | 40 | 247124 | 1.000004047 | 6178.1 |
| Cuban, PR and Mexican in Chinese blocks | 94041 | 20 | 89041 | 1.056153907 | 4452.05 |
| Other Hispanics in Chinese blocks | 164478 | 21 | 164478 | 1 | 7832.285714 |
| Afrocarib and AA in Chinese blocks | 27500 | 14 | 27500 | 1 | 1964.285714 |
| Whites in Chinese blocks | 2191741 | 71 | 2191747 | 0.999997262 | 30869.67606 |
| Asians in Africocarib and AA 60-100 blocks | 90889 | 15 | 90889 | 1 | 6059.266667 |
| Cuban and PR in Africocarib and AA 60-100 | 169970 | 25 | 59887 | 2.83817857 | 2395.48 |
| Mexican in Africocarib and AA 60-100 | 282346 | 24 | 282346 | 1 | 11764.41667 |
| Other His in Africocarib and AA 60-100 | 277315 | 16 | 94583 | 2.931975091 | 5911.4375 |
| Africocarib and AA in Africocarib and AA blocks | 21333840 | 624 | 3091690 | 6.900381345 | 4954.63141 |
| Whites in Africocarib and AA 60-100 | 1944340 | 45 | 390324 | 4.981348828 | 8673.866667 |
| Asians in AA 30-59.9 | 134914 | 18 | 134914 | 1 | 7495.222222 |
| Cuban, PR and Mexicans in AA 30-59.9 | 82585 | 22 | 77172 | 1.07014202 | 3507.818182 |
| other Hisp in AA 30-59.9 | 203503 | 24 | 143650 | 1.416658545 | 5985.416667 |
| Whites in AA 30-59.9 | 6258727 | 173 | 1989094 | 3.146521482 | 11497.65318 |
| Vit in AA 10-29.9 | 179945 | 70 | 179945 | 1 | 2570.642857 |
| Flip in AA 10-29.9 | 389766 | 91 | 389766 | 1 | 4283.142857 |
| Chinese in AA 10-29.9 | 255197 | 60 | 255197 | 1 | 4253.283333 |
| Other Asian in AA 10-29.9 | 592307 | 81 | 592307 | 1 | 7312.432099 |
| Cubans in AA 10-29.9 | 149585 | 80 | 147739 | 1.012495008 | 1846.7375 |
| PR in AA 10-29.9 | 374734 | 92 | 366761 | 1.021738953 | 3986.532609 |
| Mexicans in AA 10-29.9 | 2942722 | 194 | 2942717 | 1.000001699 | 15168.64433 |
| Other Hisp in AA 10-29.9 | 759917 | 90 | 735404 | 1.033332699 | 8171.155556 |
| Whites in AA 10-29.9 | 18867372 | 600 | 13670503 | 1.380151996 | 22784.17167 |
| Vit in AA 0-9.9 | 437448 | 188 | 437448 | 1 | 2326.851064 |
| Flip in AA 0-9.9 | 674184 | 176 | 674183 | 1.000001483 | 3830.585227 |
| Chinese in AA 0-9.9 | 1039660 | 256 | 1039660 | 1 | 4061.171875 |
| Other Asian in AA 0-9.9 | 1749290 | 265 | 1749286 | 1.000002287 | 6601.079245 |
| Cubans in AA 0-9.9 | 805628 | 432 | 805629 | 0.999998759 | 1864.881944 |
| PR in AA 0-9.9 | 536439 | 142 | 536438 | 1.000001864 | 3777.732394 |
| Mexican in AA 0-9.9 | 11529509 | 896 | 11529511 | 0.999999827 | 12867.75781 |
| Other His in AA 0-9.9 | 2592100 | 298 | 2574818 | 1.006711931 | 8640.328859 |
| Whites AA 0-9.9 | 118404822 | 3332 | 114255079 | 1.036319987 | 34290.2398 |
| Vit in Hawaiian blocks | 3403 | 2 | 3403 | 1 | 1701.5 |
| Flip in Hawaiian blocks | 421617 | 126 | 421617 | 1 | 3346.166667 |
| Chinese in Hawaiian blocks | 223468 | 69 | 223469 | 0.999995525 | 3238.681159 |
| other Asian in Hawaiian blocks | 360977 | 84 | 360977 | 1 | 4297.345238 |
| Cuban in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| PR in Hawaiian blocks | 37466 | 11 | 37466 | 1 | 3406 |
| Mexican in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Other Hisp in Hawaiian blocks | 2945 | 1 | 2945 | 1 | 2945 |
| Africocarib in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| AA in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Whites in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
Table 12: NSAL and NLAAS collapsed cells, post-stratification adjustment factor and mean weights

| Collapsed groups for NSAL & NLAAS | CPS 2002 | Un-weighted Count | Count using CPES weights | Post-stratification adjustment factor | Mean weight (nsnlwt) |
|---|---|---|---|---|---|
| Asian in Cuban blocks | 96816 | 14 | 76429 | 1.266744299 | 5459.214286 |
| Cubans in Cuban blocks | 103587 | 57 | 84349 | 1.228076207 | 1479.807018 |
| PR in Cuban blocks | 48478 | 18 | 33562 | 1.444431202 | 1864.555556 |
| Mexican and other His in Cuban blocks | 206292 | 33 | 151908 | 1.358006162 | 4603.272727 |
| Afrocarib and AA in Cuban blocks | 106254 | 16 | 52701 | 2.016166676 | 3293.8125 |
| Whites in Cuban, PR and Asian blocks | 7255331 | 35 | 536943 | 13.51229274 | 15341.22857 |
| Vit in Vit blocks | 383349 | 207 | 381506 | 1.004830855 | 1843.024155 |
| Flip in Vit blocks | 94392 | 19 | 94392 | 1 | 4968 |
| Chinese and Other Asians in Vit blocks | 74000 | 13 | 60127 | 1.230728292 | 4625.153846 |
| Cuban, PR and Mexican in Vit blocks | 310144 | 17 | 143960 | 2.154376216 | 8468.235294 |
| Other Hispanics in Vit blocks | 110127 | 10 | 84713 | 1.30000118 | 8471.3 |
| Afrocarib and AA in Vit, Flip, and Chinese blocks | 342256 | 39 | 155341 | 2.203256062 | 3983.102564 |
| Vit in Flip Blocks | 38422 | 17 | 38422 | 1 | 2260.117647 |
| Flip in Flip Blocks | 289714 | 80 | 279242 | 1.037501522 | 3490.525 |
| Chinese in Flip Blocks | 240247 | 67 | 236714 | 1.014925184 | 3533.044776 |
| other Asians in Flip blocks | 221116 | 32 | 181429 | 1.218746727 | 5669.65625 |
| Cuban, PR, Mexican and other Hisp in Flip blocks | 140140 | 17 | 121287 | 1.155441226 | 7134.529412 |
| Vit and Flip in PR blocks | 51801 | 13 | 48275 | 1.073039876 | 3713.461538 |
| Chinese and other Asians in PR blocks | 139542 | 16 | 125475 | 1.112109982 | 7842.1875 |
| Cuban and PR in PR blocks | 1019669 | 249 | 926172 | 1.100949932 | 3719.566265 |
| Mexican in PR blocks | 509666 | 34 | 393833 | 1.29411705 | 11583.32353 |
| other Hisp in PR blocks | 1109915 | 210 | 1049920 | 1.057142449 | 4999.619048 |
| Afrocarib in PR blocks | 126246 | 43 | 95237 | 1.325598244 | 2214.813953 |
| AA in PR blocks | 1571404 | 173 | 1181981 | 1.329466379 | 6832.260116 |
| Vit in Chinese blocks | 55101 | 19 | 55101 | 1 | 2900.052632 |
| Flip in Chinese blocks | 25166 | 9 | 22649 | 1.111130734 | 2516.555556 |
| Chinese in Chinese blocks | 541553 | 117 | 532451 | 1.017094531 | 4550.863248 |
| other Asians in Chinese block | 247125 | 34 | 171474 | 1.441180587 | 5043.352941 |
| Cuban, PR, Mexican, and other Hisp in Chinese blocks | 258519 | 25 | 144368 | 1.790694614 | 5774.72 |
| Asians in Africocarib and AA 60-100 blocks | 90889 | 13 | 64385 | 1.411648676 | 4952.692308 |
| Cuban and PR in Africocarib blocks | 84896 | 44 | 84897 | 0.999988221 | 1929.477273 |
| Mexican and other Hisp in Africocarib blocks | 111795 | 71 | 111796 | 0.999991055 | 1574.591549 |
| Africocarib in Africocarib blocks | 577382 | 933 | 577386 | 0.999993072 | 618.8488746 |
| AA in Africocarib blocks | 62549 | 12 | 62549 | 1 | 5212.416667 |
| Cuban in AA 60-100 | 16347 | 11 | 14984 | 1.090963695 | 1362.181818 |
| PR in AA 60-100 | 68727 | 23 | 58545 | 1.173917499 | 2545.434783 |
| Mexican in AA 60-100 | 282346 | 14 | 146402 | 1.92856655 | 10457.28571 |
| Other His in AA 60-100 | 165520 | 25 | 133484 | 1.239998801 | 5339.36 |
| Afrocarib in AA 60-100 | 288119 | 246 | 276861 | 1.040663004 | 1125.45122 |
| AA in AA 60-100 | 10596921 | 1840 | 8321916 | 1.273375146 | 4522.780435 |
| Asians in AA 30-59.9 | 134914 | 15 | 76687 | 1.759281234 | 5112.466667 |
| Cubans, PR, and Mexicans in AA 30-59.9 | 82585 | 17 | 23486 | 3.516350166 | 1381.529412 |
| Other His in AA 30-59.9 | 203503 | 28 | 142451 | 1.42858246 | 5087.535714 |
| Afrocarib in AA 30-59.9 | 220638 | 168 | 215511 | 1.023789969 | 1282.803571 |
| AA in AA 30-59.9 | 5110387 | 908 | 4135708 | 1.235674037 | 4554.744493 |
| Vit in AA 10-29.9 | 179945 | 68 | 172342 | 1.04411577 | 2534.441176 |
| Flip in AA 10-29.9 | 389766 | 89 | 373002 | 1.044943459 | 4191.033708 |
| Chinese in AA 10-29.9 | 255197 | 59 | 242849 | 1.050846411 | 4116.084746 |
| Other Asian in AA 10-29.9 | 592307 | 74 | 498077 | 1.189187616 | 6730.77027 |
| Cubans in AA 10-29.9 | 149585 | 80 | 145937 | 1.024997088 | 1824.2125 |
| PR in AA 10-29.9 | 374734 | 83 | 323989 | 1.156625688 | 3903.481928 |
| Mexicans in AA 10-29.9 | 2942722 | 160 | 2179794 | 1.350000046 | 13623.7125 |
| Other Hisp in AA 10-29.9 | 759917 | 84 | 625814 | 1.214285714 | 7450.166667 |
| Afrocarib in AA 10-29.9 | 198343 | 39 | 161155 | 1.230759207 | 4132.179487 |
| AA in AA 10-29.9 | 3545866 | 518 | 2612729 | 1.357150321 | 5043.878378 |
| Vit in AA 0-9.9 | 437448 | 185 | 430467 | 1.016217271 | 2326.848649 |
| Flip in AA 0-9.9 | 674184 | 172 | 647819 | 1.040698096 | 3766.389535 |
| Chinese in AA 0-9.9 | 1039660 | 249 | 999518 | 1.040161358 | 4014.128514 |
| Other Asian in AA 0-9.9 | 1749290 | 233 | 1273702 | 1.373390322 | 5466.532189 |
| Cubans in AA 0-9.9 | 805628 | 425 | 787108 | 1.023529173 | 1852.018824 |
| PR in AA 0-9.9 | 536439 | 117 | 402329 | 1.333334162 | 3438.709402 |
| Mexican in AA 0-9.9 | 11529509 | 635 | 6760146 | 1.705511834 | 10645.89921 |
| Other His in AA 0-9.9 | 2592100 | 238 | 1803859 | 1.436974841 | 7579.239496 |
| Afrocarib and AA in AA 0-9.9 | 733635 | 73 | 243405 | 3.014050656 | 3334.315068 |
| Whites in Africocarib and AA blocks | 145475261 | 856 | 10259583 | 14.17945164 | 11985.49416 |
| Vit in Hawaiian blocks | 3403 | 2 | 3403 | 1 | 1701.5 |
| Flip in Hawaiian blocks | 421617 | 126 | 421617 | 1 | 3346.166667 |
| Chinese in Hawaiian blocks | 223468 | 69 | 223469 | 0.999995525 | 3238.681159 |
| Other Asian in Hawaiian blocks | 360977 | 84 | 360977 | 1 | 4297.345238 |
| Cuban in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| PR in Hawaiian blocks | 37466 | 11 | 37466 | 1 | 3406 |
| Mexican in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Other Hisp in Hawaiian blocks | 2945 | 1 | 2945 | 1 | 2945 |
| Africocarib in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| AA in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
| Whites in Hawaiian blocks | 0 | 0 | 0 | 0 | 0 |
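The arithmetic behind the last two columns of Table 12 can be checked directly from the first three. The sketch below is illustrative only: it assumes, as the tabulated values suggest, that the post-stratification adjustment factor is the CPS 2002 control total divided by the count using CPES weights, and that the mean weight is the count using CPES weights divided by the un-weighted count. The names `Cell`, `adjustment_factor` and `mean_weight` are hypothetical and are not part of the CPES data set or its software.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One collapsed cell from Table 12 (hypothetical container)."""
    label: str
    cps_total: int        # CPS 2002 control total
    n_unweighted: int     # un-weighted respondent count
    weighted_count: int   # count using CPES weights


def adjustment_factor(cell: Cell) -> float:
    """Assumed formula: CPS control total / weighted count (0 for empty cells)."""
    if cell.weighted_count == 0:
        return 0.0
    return cell.cps_total / cell.weighted_count


def mean_weight(cell: Cell) -> float:
    """Assumed formula: weighted count / un-weighted count (0 for empty cells)."""
    if cell.n_unweighted == 0:
        return 0.0
    return cell.weighted_count / cell.n_unweighted


# Three rows copied from Table 12, including one empty cell.
cells = [
    Cell("Mexicans in AA 10-29.9", 2942722, 160, 2179794),
    Cell("PR in Hawaiian blocks", 37466, 11, 37466),
    Cell("Cuban in Hawaiian blocks", 0, 0, 0),
]

for c in cells:
    print(f"{c.label}: factor={adjustment_factor(c):.9f}, "
          f"mean weight={mean_weight(c):.4f}")
```

Empty cells are given a factor and mean weight of 0 only to mirror how the table reports them; no cases exist in those cells, so no weight is actually adjusted.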
To better understand weights in CPES, please consult our CPES Weights Chart and take a look at our FAQs on weights and complex design.
Alegria M, Takeuchi D, Canino G, Duan N, Shrout P, Meng X, Vega W, Zane N, Vila D, Woo M, Vera M, Guarnaccia P, Aguilar-Gaxiola S, Sue S, Escobar J, Lin K, Gong F. Considering context, space and culture: the National Latino and Asian American Study. IJMPR 2004; 13(4): 208-20.
Alegria M, Vila D, Woo M, Canino G, Takeuchi D, Vera M, Febo V, Guarnaccia P, Aguilar-Gaxiola S, Shrout P. Cultural relevance and equivalence in the NLAAS instrument: integrating etic and emic in the development of cross-cultural measures for a psychiatric epidemiology and services study of Latinos. IJMPR 2004; 13(4): 270-88.
American Association for Public Opinion Research. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. American Association for Public Opinion Research (AAPOR) standard: http://www.aapor.org , 2004.
Atrostic BK, Bates N, Burt G, Silberstein A. Non-response in US government household surveys: consistent measures, recent trends, and new insights. Journal of Official Statistics 2001; 17: 209-26.
Blaise® Survey Processing System: Version 4.5. Statistics Netherlands, 2000-2004.
Bogen K. The effect of questionnaire length on response rates - a review of the literature. Proceedings of the Section on Survey Research Methods, American Statistical Association 1996: 1020-5.
Bradburn NM. Respondent burden. Proceedings of the Section on Survey Research Methods. American Statistical Association 1978: 35-40.
Cannell C, Marquis K, Laurent A. A summary of studies. Vital Health Stat 1977; 2: 69.
Chambers, R.L. and Skinner, C.J. (editors). (2003). Analysis of Survey Data. John Wiley and Sons, New York.
Cheung GQ, Liu Y. Displaying Chinese characters in Blaise. Proceedings of the Eighth International Blaise Users Conference 2003, Copenhagen, Denmark.
Cochran WG. Sampling Techniques, 3rd edn. New York: John Wiley & Sons, 1977.
Corbin J, Morse JM. The unstructured interactive interview: issues of reciprocity and risks when dealing with sensitive topics. Qualitative Inquiry 2003; 9(3): 335-54.
de Leeuw E, de Heer W. Trends in household survey nonresponse: a longitudinal and international comparison. In R Groves, DA Dillman, JL Eltinge and RJA Little (eds) Survey Non-response. New York: Wiley, 2002, pp. 41-54.
Groeneveld R. Using non-Latin alphabets in Blaise. Proceedings of the Eighth International Blaise Users Conference 2003. Copenhagen, Denmark.
Groves RM, Couper MP. Non-response in Household Interview Surveys. New York: Wiley, 1998.
Groves RM, Fowler FJ, Couper MP, Lepkowski J, Singer E, Tourangeau R. Survey Methodology. New York: Wiley, 2004.
Groves RM. Survey Errors and Survey Costs. New York: John Wiley & Sons, 1989.
Guenzel PJ, Berckmans TR, Cannell CF. General Interviewing Techniques: A Self-Instructional Workbook for Telephone and Personal Interviewer Training. Ann Arbor, Michigan: Institute for Social Research, Survey Research Center, 1983.
Hansen MH, Hurwitz WN, Madow WG. Sample Survey Methods and Theory, Volumes I and II. New York: John Wiley & Sons, 1953.
Hansen MH, Hurwitz WN. The problem of nonresponse in surveys. Journal of the American Statistical Association 1946; 41: 517-29.
Hartley, H.O. (1962). "Multiple frame surveys." Proceedings of the Social Statistics Section of the American Statistical Association Meeting, Minneapolis, Minnesota.
Hartley, H.O. (1974). "Multiple Frame Methodology and Selected Applications." Sankhya, Series C, 36, pp. 99-118.
Heeringa SG, Connor J, Darrah D. The 1980 SRC/NORC National Sample. Ann Arbor: Survey Methodology Program, Survey Research Center, University of Michigan, 1984.
Heeringa SG, Connor J, Redmond G. The 1990 SRC National Sample. Ann Arbor: Survey Methodology Program, Survey Research Center, University of Michigan, 1994.
Heeringa SG, Groves RM. Responsive Design for Household Surveys. Ann Arbor: Survey Methodology Program, Institute for Social Research, University of Michigan, 2004.
Heeringa SG, Liu J. Complex sample design effects and inference for mental health survey data. IJMPR 1997; 7(1): 56-65.
Heeringa SG, Wagner J, Torres M, Duan N, Adams T, Berglund P. Sample designs and sampling methods for the Collaborative Psychiatric Epidemiology Studies (CPES). IJMPR 2004; 13(4): 221-40.
Heeringa, S. (2006). Technical Sample Design Documentation: National Survey of American Life (NSAL). Technical Report. Statistical Design Group, Institute for Social Research, University of Michigan, Ann Arbor.
Heeringa, S. (2004). Technical Sample Design Documentation: 2002-2003 National Latino and Asian American Study (NLAAS). Technical Report. Statistical Design Group, Institute for Social Research, University of Michigan, Ann Arbor.
Henderson AS, Jorm AF. Do mental health surveys disturb? Psychol Med 1990; 20: 721-4.
Hess I. Sampling for social research surveys: 1947-1980. Ann Arbor: Institute for Social Research, University of Michigan, 1985.
Jackson J. Life in Black America. London: Sage Publications, 1991.
Jackson JS, Torres M, Caldwell CH, Neighbors HW, Nesse R, Taylor RJ, Trierweiler SJ, Williams DR. The National Survey of American Life: a study of racial, ethnic and cultural influences on mental disorders and mental health. IJMPR 2004; 13(4): 196-207.
Jorm AF, Henderson AS, Scott R, MacKinnon AJ, Korten AE, Christensen H. Do mental health surveys disturb? Further evidence. Psychol Med 1994; 24: 233-7.
Kalton, G. (1977), "Practical methods for estimating survey sampling errors," Bulletin of the International Statistical Institute, Vol 47, 3, pp. 495-514.
Kessler R, Berglund P, Chiu WT, Demler O, Heeringa S, Hiripi E, Jin R, Pennell B, Walters E, Zaslavsky A, Zheng H. The US National Comorbidity Survey Replication (NCS-R): an overview of design and field procedures. IJMPR 2004; 13(2): 69-92.
Kessler R. The National Comorbidity Survey of the United States. International Review of Psychiatry, 1994; 6: 365-76.
Kessler RC, Üstün TB. The World Mental Health (WMH) survey initiative version of the World Health Organization Composite International Diagnostic Interview (CIDI). IJMPR 2004; 13(2): 93-121.
Kessler RC, Wittchen H-U, Abelson J, Zhao S. Methodological issues in assessing psychiatric disorders with self-reports. In AA Stone, JS Turkkan, CA Bachrach, JB Jobe, HS Kurtzman and VS Cain (eds) The Science of Self-Report: Implications for Research and Practice. Mahwah, NJ: Lawrence Erlbaum Associates, 2000; 229-55.
Kish L. A procedure for the objective selection of the respondent within the household. Journal of the American Statistical Association 1949; 44: 380-7.
Kish L. Statistical Design for Research. New York: John Wiley & Sons, 1987.
Kish L. Survey Sampling. New York: John Wiley & Sons, 1965.
Lessler JT, Kalsbeek WD. Nonsampling Errors in Surveys. New York: John Wiley & Sons, 1992.
Little, R.J.A. and Rubin, D.B. (2003). Statistical Analysis with Missing Data, 2nd Edition, John Wiley and Sons, New York.
Pennell, B.E., Bowers, A., Carr, D., Chardoul, S., Cheung, G-Q., Dinkelmann, K., Gebler, N., Hansen, S.E., Pennell, S., Torres, M. (2004). "The Development and Implementation of the National Comorbidity Survey Replication, the National Survey of American Life, and the National Latino and Asian American Survey", International Journal of Methods in Psychiatric Research, Vol. 13, No. 4, pp. 241-269.
Rao, J.N.K. & Wu, C.F.J. (1988), "Resampling inference with complex sample data," Journal of the American Statistical Association, 83, pp. 231-239.
Research Triangle Institute (2003). SUDAAN User's Manual, Release 9.0. Research Triangle Park, NC: Research Triangle Institute.
Rust, K. (1985). "Variance estimation for complex estimators in sample surveys," Journal of Official Statistics, Vol. 1, No. 4.
SAS Institute, Inc. (2003). SAS/STAT® User's Guide, Version 9, Cary, NC: SAS Institute, Inc.
Sharp LM, Frankel J. Respondent burden: a test of some common assumptions. Public Opinion Quarterly 1983; 47: 36-53.
Singer E, Groves RM, Corning AD. Differential incentives: beliefs about practices, perceptions of equity, and effects on survey participation. Public Opinion Quarterly 1999; 63: 251-60.
Singer E, Van Hoewyk J, Gebler N, Raghunathan T, McGonagle K. The effect of incentives in interviewer-mediated surveys. Journal of Official Statistics 1999; 15: 217-30.
Skinner, C.J., Holt, D., & Smith, T.M.F. (1989). Analysis of Complex Surveys. New York: John Wiley & Sons.
STATA Corp. (2004). STATA Statistical Software: Release 9.0. College Station, TX: STATA Corporation.
Turnbull JE, McLeod JD, Callahan JM, Kessler RC. Who should ask? Ethical interviewing in psychiatric epidemiology studies. American Journal of Orthopsychiatry 1988; 58(2): 228-39.
Westat, Inc. (2000). WesVar 4.0 User's Guide. Rockville, MD: Westat, Inc.
Wittchen H-U. Reliability and validity studies of the WHO Composite International Diagnostic Interview (CIDI): a critical review. J Psychiatr Res 1994; 28(1): 57-84.
Wolter, K.M. (1985). Introduction to Variance Estimation. New York: Springer-Verlag.
World Health Organization. Composite International Diagnostic Interview, Version 1.0. Geneva, Switzerland: World Health Organization, 1990.
World Health Organization. Composite International Diagnostic Interview, Version 2.1. Geneva, Switzerland: World Health Organization, 1997.