Collaborative Psychiatric Epidemiology Surveys

V. Integrated Weight for the Pooled CPES Data Set

Case-specific population weights had been developed for each CPES component survey (Kessler et al., 2004; Heeringa, 2004; Heeringa et al., 2005). Each project had carefully developed and refined its weight vector to enable robust probability sampling ("design-based") inference to its chosen survey population. NCS-R was unique among the three component studies in that it required two final analysis weights: one for the full sample of cases that participated in the Part 1 interview and a second for the subsample of cases that also completed Part 2 of the NCS-R. Consequently, the CPES combined data set also has two analysis weights: the first for analysis of common data items and the second for analysis of survey items that NCS-R administered only to Part 2 respondents.

Development of the integrated weight began with the existing final population weights for the NCS-R, NSAL, and NLAAS and then proceeded according to the following steps:

Step 1. Each NLAAS, NSAL, and NCS-R case was assigned to a race/ancestry category based on the categories and priority order provided in Table 1 (see Section III).

Step 2. Each NLAAS, NSAL, and NCS-R area segment was assigned to a geographic domain based on the definitions and priority order shown in Table 2 (see Section IV). Each NLAAS, NSAL, and NCS-R respondent was assigned to a geographic domain based on its area segment classification.

Step 3. The final population weight values for the three data sets were obtained from the NLAAS, NSAL, and NCS-R investigators. Since the final NCS-R and NSAL weights had been "centered" or "normalized" (mean weight = 1.0), they were restored to the original U.S. population scaling based on weighted totals from the March 2002 demographic supplement of the Current Population Survey (CPS).
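The restoration in Step 3 can be illustrated with a minimal sketch. It assumes a single overall population total; the source does not say whether the rescaling was done overall or within demographic subgroups. The function name `restore_population_scale` and the example figures are hypothetical, not CPS values.

```python
def restore_population_scale(normalized_weights, population_total):
    """Rescale weights so they sum to population_total.

    Relative weight sizes are preserved; only the overall scale changes.
    """
    total = sum(normalized_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    factor = population_total / total
    return [w * factor for w in normalized_weights]


# Hypothetical example: four normalized weights (mean 1.0) and an
# illustrative population total of 1,000 persons (not a CPS figure).
normalized = [0.5, 1.0, 1.5, 1.0]
restored = restore_population_scale(normalized, 1000.0)
```

After rescaling, the weights sum to the supplied population total while each case keeps the same share of that total as before.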

Step 4. Notation: each case in the CPES pooled data set was indexed as follows:

Table 3. Subscript notation for weight integration expressions
Index Subscript   Values     Representing
i                 1,...,n    Individual sample case subscript
j                 1,2,3      Study index: 1=NCS-R, 2=NSAL, 3=NLAAS
k                 1,...,11   Population index (Table 1), collapsing White and Other
l                 1,...,11   Domain index (Table 2)

Step 5. From the pooled data set, Excel spreadsheets were used to compute the number of nominal cases for each study by race/ancestry population by geographic domain cell, $n_{+jkl}$. These counts were then aggregated across the three studies to produce CPES pooled case counts for each population x domain cell:

$$n_{++kl} = \sum_{j=1}^{3} n_{+jkl}$$
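The tabulation in Step 5 was done in spreadsheets; purely as an illustration, the same counts can be sketched in a few lines of Python. The case records below are hypothetical, and the variable names `n_jkl` and `n_kl` are stand-ins for the study-level and pooled cell counts (written n+jkl and n++kl later in this section).

```python
from collections import Counter

# Hypothetical pooled records: one (study j, population k, domain l) tuple per case.
cases = [
    (1, 11, 4), (1, 11, 4), (2, 10, 4),
    (3, 7, 4), (1, 7, 4), (3, 7, 4),
]

# Nominal case count for each study x population x domain cell.
n_jkl = Counter(cases)

# Pooled case count for each population x domain cell, summed over studies.
n_kl = Counter()
for (j, k, l), count in n_jkl.items():
    n_kl[(k, l)] += count
```

Here population 7 in domain 4 has two cases from study 3 and one from study 1, so its pooled count is 3.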

Step 6. The March 2002 CPS data enabled estimation of post-stratification control totals for each race/ancestry group, k=1,...,11; however, it did not provide the geographic detail needed to allocate the population total to the l=1,...,11 geographic domains. For this purpose, the weighted population distribution from the CPES study with the most robust estimates of geographic distribution was used. NLAAS was chosen as the basis for allocating the Asian and Hispanic populations to the 11 sample geographic domains. NSAL weighted sample distributions were used to apportion the African-American and Afro-Caribbean populations to the geographic domains. White and Other population totals were allocated to geographic domains based on the empirical distribution of weights in the NCS-R.


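The control total for each population x domain cell was computed as:

$$N_{kl} = \hat{N}_{k}^{CPS} \times \frac{\sum_{i} W_{i j_k k l}}{\sum_{l'=1}^{11} \sum_{i} W_{i j_k k l'}}$$

where

$N_{kl}$ = the CPES control total for race/ethnicity population k and domain l,

$W_{ijkl}$ = the original study-specific weight for case i, study j,

$j_k$ = the CPES study chosen to estimate the domain allocation for population k, and

$\hat{N}_{k}^{CPS}$ = the March 2002 CPS population estimate for race/ethnicity category k.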

Table 4 provides the final population controls for the race/ethnicity x domains cells of the CPES weight computation array.
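The allocation described in Step 6 amounts to splitting each CPS population total across domains in proportion to the weighted domain distribution of one chosen study. The following is a minimal sketch under that reading; the function name `allocate_control_totals`, the data layout, and the example numbers are assumptions for illustration and do not come from the CPES files.

```python
from collections import defaultdict


def allocate_control_totals(cps_totals, allocation_study, cases):
    """Split each CPS population total across geographic domains.

    cps_totals:       {k: CPS population estimate for population k}
    allocation_study: {k: study j whose weights supply the domain shares}
    cases:            iterable of (j, k, l, weight) tuples
    Returns {(k, l): control total}.
    """
    cell_sum = defaultdict(float)  # weighted total by (k, l), chosen study only
    pop_sum = defaultdict(float)   # weighted total by k, chosen study only
    for j, k, l, w in cases:
        if allocation_study.get(k) == j:
            cell_sum[(k, l)] += w
            pop_sum[k] += w
    return {
        (k, l): cps_totals[k] * total / pop_sum[k]
        for (k, l), total in cell_sum.items()
    }


# Hypothetical example: population 7 is allocated using study 3 only,
# so the study 1 case below does not affect the domain shares.
controls = allocate_control_totals(
    cps_totals={7: 1000.0},
    allocation_study={7: 3},
    cases=[(3, 7, 1, 30.0), (3, 7, 2, 10.0), (1, 7, 1, 99.0)],
)
```

Because the shares within a population sum to one, the control totals for a population add back up to its CPS estimate.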

Step 7. The original population weights from each study were post-stratified to the common race/ethnicity x domain population control totals derived from the March 2002 CPS (see Step 6 and Table 4):

$$W^{adj}_{ijkl} = W_{ijkl} \times \frac{N_{kl}}{\sum_{i'} W_{i'jkl}}$$

where

$W^{adj}_{ijkl}$ = the study-specific weight adjusted to 2002 CPS population totals,

$W_{ijkl}$ = the original study-specific population weight for case i, study j,

$N_{kl}$ = the 2002 CPS estimate for race/ethnicity population k allocated to domain l.

Since the original study-specific weights for the major populations of interest had already included some form of population-based control, this rescaling to a common post-stratification standard did not require major adjustments.
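As a rough illustration of the post-stratification in Step 7, the sketch below scales each study's weights within a population x domain cell so that they sum to the cell's control total. The function name `poststratify`, the tuple layout, and the numbers are hypothetical.

```python
from collections import defaultdict


def poststratify(cases, control_totals):
    """Adjust weights so each study's cell total matches the control total.

    cases:          list of (j, k, l, weight) tuples
    control_totals: {(k, l): population control total}
    Returns the adjusted weights in the same order as cases.
    """
    cell_sum = defaultdict(float)  # weighted total by study x population x domain
    for j, k, l, w in cases:
        cell_sum[(j, k, l)] += w
    return [
        w * control_totals[(k, l)] / cell_sum[(j, k, l)]
        for j, k, l, w in cases
    ]


# Hypothetical example: two studies contribute cases to the same cell (k=7, l=1).
cases = [(1, 7, 1, 2.0), (1, 7, 1, 6.0), (3, 7, 1, 5.0)]
adjusted = poststratify(cases, {(7, 1): 800.0})
```

After this step each study, taken on its own, reproduces the full control total for the cell, which is why a further rescaling is needed before the studies are pooled.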

Step 8. Since in Step 7 the individual study-specific weights were controlled to exact counts for each race/ancestry x geographic domain cell, the remaining step involved rescaling the study-specific weights to reflect the proportion of nominal cases that each study contributed to the cell in the pooled data set.


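$$W^{CPES}_{i} = W^{adj}_{ijkl} \times \frac{n_{+jkl}}{n_{++kl}}$$

where

$W^{CPES}_{i}$ = the CPES population weight for case i;

$W^{adj}_{ijkl}$ = the standardized population weight from Step 7 for case i, study j (assigned to population k and domain l);

$n_{+jkl}$ = the number of nominal cases contributed by study j to population k and domain l; and

$n_{++kl}$ = the pooled number of nominal cases in that cell across the three studies.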

Conditional on the assigned population (k) and domain classification (l), this rescaling provided a "proportionate to sample size" contribution from each study. It linearly rescaled the weights for each individual study, so it did not alter the distribution of the study-specific population weights except to reduce the study-specific mean by a factor of $n_{+jkl}/n_{++kl}$ and the variance of the study weights (not the relvariance) by a factor of $(n_{+jkl}/n_{++kl})^2$.
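The properties claimed for the Step 8 rescaling can be checked numerically. The sketch below uses hypothetical weights for a single cell with two contributing studies and assumes the rescaling factor is each study's share of the pooled nominal case count, as the text describes; `rescale_to_pooled` is an illustrative name.

```python
from statistics import mean, pvariance


def rescale_to_pooled(adjusted_weights, n_study, n_pooled):
    """Scale a study's weights by its share of the pooled case count in a cell."""
    share = n_study / n_pooled
    return [w * share for w in adjusted_weights]


# Hypothetical cell: study 1 contributes 2 cases, study 3 contributes 1 case.
# Each study's post-stratified weights sum to the same control total (800).
study1 = [200.0, 600.0]
study3 = [800.0]
final1 = rescale_to_pooled(study1, 2, 3)
final3 = rescale_to_pooled(study3, 1, 3)

pooled_total = sum(final1) + sum(final3)              # returns to the control total
mean_ratio = mean(final1) / mean(study1)              # equals the share, 2/3
var_ratio = pvariance(final1) / pvariance(study1)     # equals the share squared
relvar_before = pvariance(study1) / mean(study1) ** 2
relvar_after = pvariance(final1) / mean(final1) ** 2  # unchanged by the rescaling
```

The pooled weights again sum to the cell's control total, the mean shrinks by the study's share, the variance by the square of that share, and the relvariance is unaffected.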

Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 1 of 2)
Sample Frame Geographic Domain (see text)
VIETNAMESE            2480    383349     38422     41222     55101         0
FILIPINO              9950     94392    289714     10579     25166         0
CHINESE              48732     54578    240247    132608    541553         0
OTHER ASIAN          35654     19422    221116      6934    247125         0
CUBAN               103587      3643      5000     10334      5000      6447
PUERTO RICAN         48478     19246     39505   1009335     19457     78449
MEXICAN               3750    287255     70701    509666     69584         0
OTHER HISPANIC      202542    110127     24934   1109915    164478    111795
AFRO-CARIBBEAN        3687      2500      1250    126246         0    577382
AFRICAN-AMERICAN    102567     68475    242531   1571404     27500     62549
WHITE AND OTHER    1086796    739518   1210972   2026304   2191741      1250
Total              1648223   1782505   2384392   6554547   3346705    837872
Table 4: Standardized Population Control Totals for CPES Weights Based on March 2002 Current Population Survey (Part 2 of 2)
Sample Frame Geographic Domain (final column: row total over all 11 domains)
VIETNAMESE            24555       4348     179945     437448       3403    1170273
FILIPINO              11442      27032     389766     674184     421617    1953842
CHINESE               10720      50153     255197    1039660     223468    2596916
OTHER ASIAN           44172      53381     592307    1749290     360977    3330378
CUBAN                 16347      11412     149585     805628          0    1116983
PUERTO RICAN          68727       3229     374734     536439      37466    2235065
MEXICAN              282346      67944    2942722   11529509          0   15763477
OTHER HISPANIC       165520     203503     759917    2592100       2945    5447776
AFRO-CARIBBEAN       288119     220638     198343      12119          0    1430284
AFRICAN-AMERICAN   10596921    5110387    3545866     721516          0   22049716
WHITE AND OTHER     1943090    6258727   18867372  118404822          0  152730592
Total              13451959   12010754   28255754  138502715    1049876  209825302