- No. Some questions were asked in one study and not in the others. Questions not asked of all respondents include the following:
- Age under 45: The Attention Deficit/Hyperactivity (AD), Oppositional Defiant Disorder (OD), and Conduct Disorder (CD) sections were administered only to individuals who were under the age of 45.
- The National Comorbidity Survey Replication (NCS-R) had two Parts: Part 1 included a core diagnostic assessment of all 9,282 respondents. Part 2 was administered only to 5,692 of the 9,282 Part 1 respondents, including all Part 1 respondents with a lifetime disorder plus a probability subsample of other respondents. See the Final Weights section of the User Guide section of About CPES for an explanation of the two parts and the implications for weighting analyses.
- The National Survey of American Life (NSAL) Black Caribbean sample was asked some questions that were not asked of any other sample.
- The National Survey of American Life (NSAL) had several sections which were not administered to the White sample. See below for a list of sections skipped if the NSAL respondent was White.
| Question Numbers | Mental Disorder | Administered to Whites |
|---|---|---|
| DP1-DP88 | Depression | X |
| M1-M54 | Mania | X |
| PD1-PD66 | Panic Disorder | X |
| SO1-SO40 | Social Phobia | X |
| AG1-AG39 | Agoraphobia | X |
| GA1-GA51b | Generalized Anxiety Disorder | X |
| SD0-SD29 | Suicidality | |
| SU1-SU120b | Alcohol and Other Substance Abuse and Dependence | |
| PH1-PH175 | Pharmacoepidemiology | |
| PEA40-PEA83 | Personality Disorders | |
| PT1-PT281 | Post-Traumatic Stress Disorder | |
| NSD1-NSD2 | 30-Day Symptoms | |
| TB1 | Tobacco Use | |
| EA1-EA43 | Eating Disorders: Anorexia and Bulimia | |
| PR1-PR19a | Pre-Menstrual Dysphoric Disorder | |
| O1-O17 | Obsessive-Compulsive Disorder | |
| PS1-PS10 | Psychosis Screen | |
| GM1-GM6 | Gambling | |
| FH1-FH39 | Family History | |
| AD1-AD51 | Attention-Deficit/Hyperactivity Disorder | |
| OD1-OD27 | Oppositional Defiant Disorder | |
| CD1-CD40 | Conduct Disorder | |
| SA1-SA50b | Separation Anxiety | |
| SR1-SR135 | Services | |
The Latino sample in the NSAL is part of the Afro-Caribbean sample. Therefore, NSAL required that all Latinos in the NSAL self-report as being of Black race and of Caribbean descent. The Latino sample in the NLAAS could have self-reported being of any race.
Yes, they are different in one very important respect: the NCS-R White sample is representative of non-Latino Whites in the U.S., whereas the NSAL Whites are representative of non-Latino Whites in the U.S. who live in households located in census tracts and block groups with a 10% or greater African American population. NSAL Whites are unique in that their selection rate was based on the African American distribution; that is, their probability of selection increased as the density of African Americans in the block group increased. This NSAL White sample was designed to be optimal for comparative analyses in which residential, environmental, and socioeconomic characteristics are controlled.
- There is a race/ethnicity variable in CPES. It does not contain the number of Native Americans because of possible confidentiality issues. The restricted-use version of NCS-R includes this information but the actual number of respondents in this category may be insufficient to do any meaningful analyses.
- The 2001-2 NCS survey program included two major survey components: the NCS-R replication study and the National Comorbidity Survey-2 (NCS-2) reinterview study. The NCS-2 was a longitudinal study that attempted to recontact and reinterview all surviving respondents from the 1992 National Comorbidity Survey (Kessler et al., 1994). The focus of the discussion here is on the NCS-R, a new cross-sectional sample survey of the U.S. adult population. The NCS and the NCS-R are two completely separate cross-sectional surveys that cannot be linked.
According to the NSAL variable "RANCEST", there are 1,438 Afro-Caribbeans and 183 Hispanics; however, the literature published on the NSAL reports 1,621 Afro-Caribbeans because that count includes the 183 Hispanics. The RANCEST recode is described in the Data Processing Notes on the CPES Web site. The 183 individuals labeled "All Other Hispanics" are Black Caribbeans from the British Virgin Islands, Guadeloupe, the Dominican Republic, Panama, Costa Rica, Nicaragua, and Honduras.
- You may do this by choosing a variable that only Part 2 respondents answered and cross-tabulating it with the variable "sex". For example, all 5,692 Part 2 participants answered Question DA39 (Is biological mother still living?). When this variable is cross-tabulated with "sex", the distribution is 2,382 males and 3,310 females (unweighted). You may want to try other variables from the long form in similar crosstabs to check the gender breakdown.
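The same check can be sketched outside a statistics package. The following is a minimal Python illustration, assuming the data have been exported as a list of records; the toy rows are made up, and the use of `None` to mark an item that was not asked is an assumption of this sketch, not the CPES missing-data coding.

```python
from collections import Counter


def crosstab_by_sex(records, part2_var="DA39"):
    """Count respondents by sex among those with a non-missing Part 2 item."""
    counts = Counter()
    for row in records:
        # None marks "not asked" here (assumed coding for this sketch),
        # so only respondents who received the Part 2 item are counted.
        if row.get(part2_var) is not None:
            counts[row["sex"]] += 1
    return dict(counts)


# Toy rows for illustration only; these are not CPES data.
toy = [
    {"sex": "male", "DA39": 1},
    {"sex": "female", "DA39": 5},
    {"sex": "female", "DA39": None},
    {"sex": "male", "DA39": None},
    {"sex": "female", "DA39": 1},
]

print(crosstab_by_sex(toy))
```

Run against the real NCS-R records, a count of this kind should reproduce the unweighted 2,382 males and 3,310 females reported above.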
The CPES represents the English-speaking non-Hispanic White, African American, and Caribbean Black populations; the English- and Spanish-speaking Mexican, Puerto Rican, and Cuban populations; and the English-, Tagalog-, Vietnamese-, and Chinese-speaking Chinese, Filipino, and Vietnamese populations of the United States. Individuals of "other" race/ethnic background and "other" Latino and Asian ancestry were also included, but U.S. population representation of these groups was not possible.
In making our population projections, we have used a population estimate of 209,128,094 people. This is the number of people ages 18+ in the U.S. in 2001 based on published Census data. The majority of all respondents were interviewed in 2001 and therefore this is the best population we can use for our projections. We did not pull out homeless or institutionalized people or people who don't speak English, all of whom were excluded from our sample. These people probably make up about 5% of the population. We don't want to reify our rough estimate of population size, though, so you should feel free to use another estimate. For example, you might want to average the Census population estimates over the years of the survey and/or adjust for the exclusion of the non-household population.
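As a rough illustration of how such a projection works, the sketch below converts a weighted prevalence into a projected number of people. The function name, the `excluded_share` parameter, and the 10% prevalence are hypothetical choices for this example; only the population figure and the approximate 5% exclusion come from the text above.

```python
POP_18_PLUS_2001 = 209_128_094  # U.S. population ages 18+ in 2001, as quoted above


def project_count(prevalence_pct, population=POP_18_PLUS_2001, excluded_share=0.0):
    """Project a weighted prevalence (in percent) onto a population total.

    excluded_share optionally removes the fraction of the population that was
    outside the sampling frame (for example, about 0.05 for the homeless,
    institutionalized, and non-English-speaking groups mentioned above).
    """
    if not 0.0 <= excluded_share < 1.0:
        raise ValueError("excluded_share must be in [0, 1)")
    covered_population = population * (1.0 - excluded_share)
    return covered_population * prevalence_pct / 100.0


# Hypothetical prevalence of 10%, chosen only to keep the arithmetic visible.
print(round(project_count(10.0)))
print(round(project_count(10.0, excluded_share=0.05)))
```

Passing a different `population` value, such as an average of Census estimates over the survey years, changes the projected count but not the prevalence itself.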
- In the Data Collection section of About CPES, Table 8 shows, for each survey, the number of interviews, response rate, interview length (minutes), and average number of contacts per interview. The response rates for each component study of the CPES were:
- The National Comorbidity Survey Replication (NCS-R) response rate was 70.9%.
- The National Survey of American Life (NSAL) response rate was 72.3% overall and 70.7% for African Americans, 77.7% for Caribbean Blacks, and 69.7% for Whites.
- The National Latino and Asian American Study (NLAAS) response rate was 75.5% for the Latino surveys and 65.6% for the Asian surveys.
- Weighted and unweighted frequencies and an example of an NCSR subpopulation analysis.
NCS-R adopts the practice of normalizing its analysis weights. This is accomplished by multiplying the original population scale weight by a factor of k=n/sum(population weights for all sample cases). Analysis results for all statistics of interest (except population totals) do not change under this or any other linear scaling of the weights.
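The effect of this scaling can be seen in a small numerical sketch. The Python below uses made-up toy weights, not CPES values, and the function names are illustrative; it applies k = n / sum(population weights) and compares a weighted mean and a weighted total before and after normalization.

```python
def normalize_weights(pop_weights):
    """Scale population weights so that they sum to the sample size n."""
    n = len(pop_weights)
    k = n / sum(pop_weights)  # k = n / sum(population weights)
    return [k * w for w in pop_weights]


def weighted_mean(values, weights):
    """Weighted mean; unchanged when every weight is multiplied by a constant."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_total(values, weights):
    """Weighted total; this statistic does depend on the scale of the weights."""
    return sum(v * w for v, w in zip(values, weights))


# Toy data, not CPES values: four cases with population-scale weights
# and a 0/1 indicator (for example, presence of a disorder).
pop_w = [1500.0, 3000.0, 4500.0, 3000.0]
has_dx = [1, 0, 0, 1]

norm_w = normalize_weights(pop_w)

print(sum(norm_w))  # sum of normalized weights equals n, up to rounding
print(weighted_mean(has_dx, pop_w), weighted_mean(has_dx, norm_w))
print(weighted_total(has_dx, pop_w), weighted_total(has_dx, norm_w))
```

The two weighted means agree, while the weighted totals differ by the factor k, which is why population totals are the one exception noted above.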
In any weighted data set, the weighted frequencies are meaningless because the data producer can scale these values to any total value that they choose. The common practice of normalizing weights, used by NCS-R, scales the sum of the analysis weights to the unweighted sample size, n. The observation that weighted counts are less than unweighted counts suggests to me that the population weights for the cases included in this analysis were on average less than the average of weights for the full sample. In short, this is a good observation and it was an important question to ask. I recommend that analysts always produce tables of statistics that show the unweighted n, the weighted estimate of the statistics of interest, and estimated standard errors or confidence intervals that reflect the effect of weighting as well as the design stratification and clustering on the sampling variance of estimates.
It is possible for the weighted frequency in a given cell to be lower than the unweighted frequency, since the weights are non-integer and some are less than 1. Below is a SAS PROC MEANS analysis of the weight for the NCS-R Part 2 data that illustrates how this happens; overall, the sum of the weights is still n=5,692 (Part 2 of the NCS-R).
The example also demonstrates how to correctly analyze the subpopulation of those with 12-month MDDH in the NCS-R data set. An implied domain statement is used in PROC SURVEYFREQ of SAS v9.2 for analysis of the selected subpopulation. This is accomplished by listing the domain variable first in the TABLES statement of PROC SURVEYFREQ (see the SAS v9.2 documentation for more details on PROC SURVEYFREQ and domain analysis).
* NOTE: V07655 IS THE DSM 12 MONTH MDDH INDICATOR VARIABLE FROM THE CPES DATA SET;
* THIS VARIABLE IS LISTED FIRST IN THE TABLES STATEMENT BELOW AND FUNCTIONS AS A DOMAIN VARIABLE;
* DATA USED IS THE FULL CPES DATA SET WITH THE NCSR PART 2 WEIGHT;
proc surveyfreq data=CPES;
STRATA SESTRAT ;
CLUSTER SECLUSTR ;
WEIGHT NCSRWTLG ; * NCSR PART 2 WEIGHT, AS REFLECTED IN THE SUM OF WEIGHTS REPORTED BELOW;
tables V07655*RANCEST /row chisq;
run;
The SURVEYFREQ Procedure
Number of Strata 42
Number of Clusters 84
Number of Observations 20013
Number of Observations Used 5692
Number of Obs with Nonpositive Weights 14321
Sum of Weights 5692.0038
Table of V07655 by RANCEST

| V07655 | RANCEST | Frequency | Weighted Frequency | Std Dev of Wgt Freq | Percent | Std Err of Percent |
|---|---|---|---|---|---|---|
| 1 | 4 | 10 | 5.65100 | 1.99781 | 0.0993 | 0.0363 |
| 1 | 8 | 62 | 37.33360 | 6.56090 | 0.6559 | 0.1217 |
| 1 | 10 | 71 | 35.42600 | 5.15569 | 0.6224 | 0.1006 |
| 1 | 11 | 474 | 289.57490 | 25.29687 | 5.0874 | 0.3079 |
| 1 | 12 | 29 | 14.54000 | 3.34056 | 0.2554 | 0.0549 |
| 1 | Total | 646 | 382.52550 | 25.57003 | 6.7204 | 0.3112 |
| 5 | 4 | 73 | 88.97570 | 14.76961 | 1.5632 | 0.2744 |
| 5 | 8 | 465 | 592.67140 | 60.34542 | 10.4124 | 1.1330 |
| 5 | 10 | 646 | 668.49300 | 53.64654 | 11.7444 | 0.9919 |
| 5 | 11 | 3706 | 3852 | 234.29613 | 67.6654 | 1.6432 |
| 5 | 12 | 156 | 107.82340 | 13.40303 | 1.8943 | 0.2469 |
| 5 | Total | 5046 | 5309 | 233.24960 | 93.2796 | 0.3112 |
| Total | 4 | 83 | 94.62670 | 15.56364 | 1.6624 | 0.2909 |
| Total | 8 | 527 | 630.00500 | 61.94072 | 11.0682 | 1.1746 |
| Total | 10 | 717 | 703.91900 | 54.99826 | 12.3668 | 1.0356 |
| Total | 11 | 4180 | 4141 | 255.08542 | 72.7528 | 1.8167 |
| Total | 12 | 185 | 122.36340 | 14.66705 | 2.1497 | 0.2628 |
| Total | Total | 5692 | 5692 | 251.09597 | 100.000 | |
PROC MEANS DATA=CPES N MEAN;
CLASS V07655 RANCEST;
VAR NCSRWTLG; *ANALYSIS VARIABLE IS THE NCSR PART 2 WEIGHT ;
WHERE NCSRWTLG NE .; *SELECT ONLY THOSE WITH NON-MISSING ON THE NCSR PART 2 WEIGHT ;
RUN;
The MEANS Procedure
Analysis Variable : NCSRWTLG NCSR sample part 2 weight
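| V07655 (12Mo) | RANCEST (Ancestry) | N Obs | N | Mean |
|---|---|---|---|---|
| 1 | 4 | 10 | 10 | 0.5651000 |
| 1 | 8 | 62 | 62 | 0.6021548 |
| 1 | 10 | 71 | 71 | 0.4989577 |
| 1 | 11 | 474 | 474 | 0.6109175 |
| 1 | 12 | 29 | 29 | 0.5013793 |
| 5 | 4 | 73 | 73 | 1.2188452 |
| 5 | 8 | 465 | 465 | 1.2745622 |
| 5 | 10 | 646 | 646 | 1.0348189 |
| 5 | 11 | 3706 | 3706 | 1.0392647 |
| 5 | 12 | 156 | 156 | 0.6911756 |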
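The link between the two outputs above can be checked by hand: in each cell, the weighted frequency is the unweighted N multiplied by the mean weight. The short Python sketch below does this for the V07655=1 cells, using figures copied from the PROC MEANS and PROC SURVEYFREQ output; the dictionary layout and function name are illustrative only.

```python
# RANCEST code -> (unweighted N, mean NCSRWTLG, weighted frequency),
# copied from the V07655=1 rows of the two SAS outputs above.
cells_12mo_mddh = {
    4: (10, 0.5651000, 5.65100),
    8: (62, 0.6021548, 37.33360),
    10: (71, 0.4989577, 35.42600),
    11: (474, 0.6109175, 289.57490),
    12: (29, 0.5013793, 14.54000),
}


def implied_weighted_freq(n, mean_weight):
    """Weighted frequency implied by a cell's case count and mean weight."""
    return n * mean_weight


implied_total = 0.0
unweighted_total = 0
for rancest, (n, mean_w, reported) in cells_12mo_mddh.items():
    implied = implied_weighted_freq(n, mean_w)
    implied_total += implied
    unweighted_total += n
    print(rancest, n, round(implied, 4), reported)

# Every mean weight in this domain is below 1, so the weighted total
# falls below the unweighted count of cases in the domain.
print(unweighted_total, round(implied_total, 4))
```

Because the mean weight in every V07655=1 cell is below 1, the weighted count for this subpopulation is smaller than its unweighted count, even though the weights sum to 5,692 over the full Part 2 sample.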
- The National Comorbidity Survey Replication (NCS-R) Part 1 included a core diagnostic assessment of all 9,282 respondents and Part 2 was administered only to 5,692 of the 9,282 Part 1 respondents, including all Part 1 respondents with a lifetime disorder plus a probability subsample of other respondents. See NCS-R Part 1 and 2 Sample for more information. See Final Weights and Special Analysis Considerations for Weighted Analysis in the Weighting section of About CPES for an explanation of the two parts and the implications for weighting analyses.
Since the NCS-R obtained rich data on White respondents, and due to the high cost of obtaining interviews, it was felt that it was not necessary to obtain additional data on White respondents in the NSAL. The data on the NCS-R Whites is available now that the data have been merged.
- There are a couple of reasons why the number of cases will not be consistent across all variables:
- Some questions were asked in one study and not in the others. See "Are all questions asked of all respondents?"
- Several different versions of the instrument were used in the component surveys, with the result that not all respondents answered every question. Thus, the number of cases will not be consistent across all variables. In terms of questionnaire versions, NCS-R had 15, NSAL had 14, and NLAAS had 8.
- When the NSAL data are weighted with the NSALWTCT weight, the sample size is centered to the proportions in which the three race groups exist in NSAL's sample population, that is, where African Americans live plus where Caribbeans live, and Whites in areas where 10% or more of the population is Black. Therefore, when these weights are applied, the NSAL White weighted sample size increases and the two Black samples decrease.
In the past, when most analysts did not correct the standard errors for complex survey design, this posed analytical challenges, since analyses assuming a simple random sample used the number of cases to determine the significance of the analyses (chi-squares, regressions, etc.). The common solution was to center the weights to each race's sample size in order to have a large enough sample size by race when looking at race differences. But with complex design correction, the numbers of strata and clusters are used instead of the number of cases to determine significance. Therefore, the small N by race does not affect the analyses. Using the NSAL population weight NSALWTPN instead of NSALWTCT will result in very large population N's but the same analysis results.