V. Integrated Weight for the Pooled CPES Data Set
Case-specific population weights had been developed for each CPES component survey (Kessler et al., 2004; Heeringa, 2004; Heeringa et al., 2005). Each project had carefully developed and refined its weight vector to support robust design-based (probability sampling) inference for its chosen survey population. NCS-R was unique among the three component studies in that it required two final analysis weights--one weight for the full sample of cases that participated in the Part 1 interview and a second for the subsample of cases that also completed Part 2 of the NCS-R. Consequently, the CPES combined data set also has two analysis weights--the first for analysis of common data items and the second for analysis of survey items that NCS-R administered only to Part 2 respondents.
Integrated weight development began with the existing final population weights for the NCS-R, NSAL, and NLAAS and then proceeded through the following steps:
Step 1. Each NLAAS, NSAL, and NCS-R case was assigned to a race/ancestry category based on the categories and priority order provided in Table 1 (see Section III).
Step 2. Each NLAAS, NSAL, and NCS-R area segment was assigned to a geographic domain based on the definitions and priority order shown in Table 2 (see Section IV). Each NLAAS, NSAL, and NCS-R respondent was assigned to a geographic domain based on its area segment classification.
Step 3. The final population weight values for the three data sets were obtained from the NLAAS, NSAL, and NCS-R investigators. Because the final NCS-R and NSAL weights had been "centered" or "normalized" (mean weight=1.0), they were restored to the original U.S. population scaling based on weighted totals from the March 2002 demographic supplement of the Current Population Survey (CPS).
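The restoration of normalized weights to population scale in Step 3 can be sketched as follows. This is a minimal illustration, assuming a single control total per study; the function name `restore_population_scale` and all numbers are hypothetical, and the source does not say whether the rescaling was applied overall or within demographic groups.

```python
def restore_population_scale(normalized_weights, population_total):
    """Rescale normalized (mean 1.0) weights so that they sum to a population total.

    Assumption: one overall control total is used; the actual CPES procedure
    may have rescaled within groups.
    """
    weight_sum = sum(normalized_weights)
    if weight_sum <= 0:
        raise ValueError("weights must have a positive sum")
    factor = population_total / weight_sum
    return [w * factor for w in normalized_weights]


# Hypothetical data: four normalized weights and a made-up control total.
normalized = [0.5, 1.5, 0.8, 1.2]
restored = restore_population_scale(normalized, population_total=1000.0)
```

The relative sizes of the weights are unchanged; only their common scale moves from a mean of 1.0 to population counts.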
Step 4. Notation: each case in the CPES pooled data set was indexed as follows:
|Table 3. Subscript notation for weight integration expressions|
|Subscript||Range||Description|
|i||1,...,n||Individual sample case subscript|
|j||1,2,3||Study index: 1=NCS-R, 2=NSAL, 3=NLAAS|
|k||1,...,11||Population index (Table 1), collapsing White and Other|
|l||1,...,11||Domain index (Table 2)|
Step 5. From the pooled data set, Excel spreadsheets were used to compute the number of nominal cases contributed by each study to each race/ancestry population x geographic domain cell, $n_{+jkl}$. These counts were then aggregated across the three studies to produce CPES pooled case counts for each population x domain cell:
$$n_{++kl} = \sum_{j=1}^{3} n_{+jkl}$$
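The cell counts of Step 5 were produced in spreadsheets; the same tabulation can be sketched in a few lines of code. The records below are hypothetical, and the names `n_jkl` and `n_kl` are illustrative labels for the study-level and pooled counts.

```python
from collections import Counter

# Hypothetical pooled records: (study j, population k, domain l) for each case.
cases = [
    (1, 3, 2), (1, 3, 2), (2, 3, 2),
    (3, 5, 1), (3, 5, 1), (1, 5, 1),
    (2, 7, 4),
]

# Nominal case counts per study x population x domain cell.
n_jkl = Counter(cases)

# Pooled counts per population x domain cell, summed over the three studies.
n_kl = Counter()
for (j, k, l), count in n_jkl.items():
    n_kl[(k, l)] += count
```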
Step 6. The March 2002 CPS data enabled estimation of post-stratification control totals for each race/ancestry group, k=1,...,11; however, it did not provide the geographic detail needed to allocate the population total to the l=1,...,11 geographic domains. For this purpose, the weighted population distribution from the CPES study with the most robust estimates of geographic distribution was used. NLAAS was chosen as the basis for allocating the Asian and Hispanic populations to the 11 sample geographic domains. NSAL weighted sample distributions were used to apportion the African-American and Afro-Caribbean populations to the geographic domains. White and Other population totals were allocated to geographic domains based on the empirical distribution of weights in the NCS-R.
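$$N_{kl} = \hat{N}_{k}^{CPS} \times \frac{\sum_{i \in (j(k),k,l)} W_{i,j(k)}}{\sum_{i \in (j(k),k)} W_{i,j(k)}}$$
where
$N_{kl}$ = the CPES control total for race/ethnicity population k and domain l,
$W_{ij}$ = the original study-specific weight for case i, study j,
$j(k)$ = the CPES study chosen to estimate the domain allocation for population k, and
$\hat{N}_{k}^{CPS}$ = the March 2002 CPS population estimate for race/ethnicity category k.
The numerator sums over cases of study $j(k)$ in population k and domain l; the denominator sums over all cases of study $j(k)$ in population k.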
Table 4 provides the final population controls for the race/ethnicity x domains cells of the CPES weight computation array.
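The allocation in Step 6 amounts to splitting each CPS population total across domains in proportion to the weighted domain distribution of one designated study. A minimal sketch, assuming that reading of the step; the function name, labels, and numbers are hypothetical and are not values from Table 4.

```python
from collections import defaultdict


def allocate_control_totals(cases, cps_totals, allocation_study):
    """Split each CPS population total across domains.

    cases: iterable of (study, population, domain, weight) tuples.
    cps_totals: {population: CPS population estimate}.
    allocation_study: {population: study whose weights define the split}.
    Returns {(population, domain): control total}.
    """
    domain_sum = defaultdict(float)
    population_sum = defaultdict(float)
    for study, population, domain, weight in cases:
        # Only the designated study contributes to the domain distribution.
        if allocation_study.get(population) == study:
            domain_sum[(population, domain)] += weight
            population_sum[population] += weight
    return {
        (population, domain): cps_totals[population] * w / population_sum[population]
        for (population, domain), w in domain_sum.items()
    }


# Hypothetical inputs.
cases = [
    ("NLAAS", "Asian", 1, 300.0),
    ("NLAAS", "Asian", 2, 100.0),
    ("NCS-R", "Asian", 1, 999.0),  # ignored: NCS-R is not the allocation study for this group
    ("NSAL", "African-American", 1, 50.0),
    ("NSAL", "African-American", 3, 150.0),
]
cps_totals = {"Asian": 1000.0, "African-American": 2000.0}
allocation_study = {"Asian": "NLAAS", "African-American": "NSAL"}

controls = allocate_control_totals(cases, cps_totals, allocation_study)
```

By construction, the domain control totals for each population add back up to that population's CPS estimate.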
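Step 7. The original population weights from each study were post-stratified to the common race/ethnicity x domain population control totals derived from the March 2002 CPS (see Step 6 and Table 4):
$$W_{ijkl}^{std} = W_{ij} \times \frac{N_{kl}}{\sum_{i' \in (j,k,l)} W_{i'j}}$$
where
$W_{ijkl}^{std}$ = the study-specific weight adjusted to 2002 CPS population totals,
$W_{ij}$ = the original study-specific population weight for case i, study j, and
$N_{kl}$ = the 2002 CPS estimate for race/ethnicity population k allocated to domain l.
The sum in the denominator runs over all cases of study j assigned to population k and domain l.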
Since the original study-specific weights for the major populations of interest had already included some form of population-based control, this rescaling to a common post-stratification standard did not require major adjustments.
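The post-stratification in Step 7 can be illustrated with a short sketch: within each study, the weights in a population x domain cell are multiplied by a common factor so that they sum to the cell's control total. The function name and data are hypothetical, and the sketch assumes every study x population x domain cell that occurs has a positive weight sum.

```python
from collections import defaultdict


def post_stratify(cases, control_totals):
    """Scale each study's weights so every cell sums to its control total.

    cases: list of (study, population, domain, weight) tuples.
    control_totals: {(population, domain): control total}.
    Returns adjusted weights in the same order as `cases`.
    """
    cell_sum = defaultdict(float)
    for study, population, domain, weight in cases:
        cell_sum[(study, population, domain)] += weight
    adjusted = []
    for study, population, domain, weight in cases:
        factor = control_totals[(population, domain)] / cell_sum[(study, population, domain)]
        adjusted.append(weight * factor)
    return adjusted


# Hypothetical data: two studies contributing cases to the same cell (k=3, l=2).
cases = [
    (1, 3, 2, 100.0), (1, 3, 2, 300.0),
    (2, 3, 2, 50.0), (2, 3, 2, 150.0),
]
adjusted = post_stratify(cases, {(3, 2): 800.0})
```

After this step each study, taken alone, reproduces the full control total for the cell, which is why a further rescaling is needed before the studies are pooled.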
Step 8. Since in Step 7 the individual study-specific weights were controlled to exact counts for each race/ancestry x geographic domain cell, the remaining step involved rescaling the study-specific weights to reflect the proportion of nominal cases that each study contributed to the cell in the pooled data set.
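$$W_{i}^{CPES} = W_{ijkl}^{std} \times \frac{n_{+jkl}}{n_{++kl}}$$
where
$W_{i}^{CPES}$ = the CPES population weight for case i;
$W_{ijkl}^{std}$ = the standard population weight for case i, study j (assigned to population k and domain l);
$n_{+jkl}$ = the number of nominal cases that study j contributed to population k and domain l; and
$n_{++kl}$ = the pooled number of nominal cases in population k and domain l across the three studies.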
Conditional on the assigned population (k) and domain classification (l), this rescaling provided a "proportionate to sample size" contribution from each study. It linearly rescaled the weights for each individual study and did not alter the distribution of the study-specific population weights except to scale the study-specific mean by a factor of $n_{+jkl}/n_{++kl}$ and the variance of the study weights (not the relvariance) by a factor of $(n_{+jkl}/n_{++kl})^2$.
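The proportionate rescaling of Step 8 can be sketched as follows. The function name and data are hypothetical; the input weights are assumed to have already been controlled to the cell total within each study, as described in Step 7.

```python
from collections import Counter


def pool_weights(cases):
    """Scale each study's weights by its share of nominal cases in the cell.

    cases: list of (study, population, domain, standardized_weight) tuples.
    Returns the pooled weights in the same order as `cases`.
    """
    n_jkl = Counter((s, p, d) for s, p, d, _ in cases)
    n_kl = Counter((p, d) for _, p, d, _ in cases)
    return [w * n_jkl[(s, p, d)] / n_kl[(p, d)] for s, p, d, w in cases]


# Hypothetical cell (k=3, l=2) with control total 800:
# study 1 contributes three cases, study 2 contributes one.
cases = [
    (1, 3, 2, 200.0), (1, 3, 2, 200.0), (1, 3, 2, 400.0),
    (2, 3, 2, 800.0),
]
pooled = pool_weights(cases)
```

In this example study 1 holds three of the four cases and so carries three quarters of the cell total, study 2 carries one quarter, and the pooled weights again sum to the control total.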
|Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 1 of 2)|
|Sample Frame Geographic Domain|
|WHITE AND OTHER||1086796||739518||1210972||2026304||2191741||1250|
|Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 2 of 2)|
|Sample Frame Geographic Domain|
|WHITE AND OTHER||1943090||6258727||18867372||118404822||0||152730592|