mental health services,
substance abuse treatment,
Date of Collection:
Unit of Observation:
The civilian, noninstitutionalized population of the
United States aged 12 and older, including residents of
noninstitutional group quarters such as college dormitories, group
homes, shelters, rooming houses, and civilians dwelling on military
Data Collection Notes:
Data were collected and prepared for release by
Research Triangle Institute, Research Triangle Park, North Carolina.
Prior to the 2002 survey, this series was titled National
Household Surveys on Drug Abuse.
Although the design of the 2004 survey is similar to the design of the 1999 through 2001 surveys, important methodological changes introduced in 2002 affect the 2004 estimates. Since 2002, each NSDUH respondent has been given an incentive payment of $30. This change improved the survey response rate. In addition, new population data from the 2000 decennial Census became available in 2002 for use in NSDUH sample weighting procedures. Therefore, data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
For selected variables, statistical
imputation was performed following logical inference to replace
missing responses. These variables are identified in the codebook as
"...LOGICALLY ASSIGNED" for the logical procedure, or by the
designation "IMPUTATION-REVISED" in the variable label when the
statistical procedure was also performed. The names of statistically
imputed variables begin with the letters "IR." For each
imputation-revised variable, a corresponding imputation indicator
variable indicates whether a case's value on the variable resulted
from an interview response or was imputed. Missing values for some
demographic variables were imputed by the unweighted hot-deck
technique used in previous surveys. Beginning in 1999, imputation of
missing values for most variables was accomplished using
predictive mean neighborhoods (PMN), a new procedure developed
specifically for this survey. Both the hot-deck and PMN imputation
procedures are described in the codebook.
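As a rough illustration of the general hot-deck idea (not the NSDUH procedure itself, which is documented in the codebook), the following minimal Python sketch fills each missing value with the value of a randomly chosen donor from the same imputation class and records an indicator of whether the value was imputed. The field names ("educ", "age_grp", "imputed") and the use of a single class variable are assumptions made for the example.

```python
import random

def hot_deck_impute(records, var, class_var, seed=0):
    """Unweighted hot-deck sketch: replace a missing value (None) with the
    value of a randomly chosen donor from the same imputation class."""
    rng = random.Random(seed)

    # Collect observed (donor) values by imputation class.
    donors = {}
    for rec in records:
        if rec[var] is not None:
            donors.setdefault(rec[class_var], []).append(rec[var])

    imputed_records = []
    for rec in records:
        new_rec = dict(rec)
        pool = donors.get(rec[class_var])
        if rec[var] is None and pool:
            new_rec[var] = rng.choice(pool)
            new_rec["imputed"] = True   # hypothetical indicator field
        else:
            # Either an interview response, or no donor was available.
            new_rec["imputed"] = False
        imputed_records.append(new_rec)
    return imputed_records

example = [
    {"age_grp": 1, "educ": 3},
    {"age_grp": 1, "educ": None},   # one donor available in class 1
    {"age_grp": 2, "educ": None},   # no donor in class 2, stays missing
]
result = hot_deck_impute(example, var="educ", class_var="age_grp")
```

In this sketch the indicator plays the same role as the imputation indicator variables described above: it lets an analyst separate interview responses from imputed values.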
To protect the privacy
of respondents, all variables that could be used to identify
individuals have been encrypted or collapsed in the public use
file. To further ensure respondent confidentiality, the data producer
applied data substitution and deleted state identifiers and a
subsample of records when creating the public use file.
Previously published estimates may not be exactly reproducible from
the variables in the public use file due to the disclosure protection
procedures that were implemented.
The data definition and
dictionary files for Stata are designed to be compatible with StataSE,
Version 8. This is a large data file requiring that approximately 250
megabytes of Random Access Memory be allocated to Stata. Operations
within Stata, including conversion of the ASCII data to Stata format,
are likely to be slow. Analysts may wish to download subsets of data
from the SAMHDA Data Analysis System (DAS) for use with Stata.
Since 1999, the survey sample has employed a 50-State design with an independent, multistage area probability sample for each of the 50 States and the District of Columbia.
The 2004 sample design
is a continuation of the coordinated five-year sample design that was
implemented for the 1999 through 2003 surveys. Although there is no
overlap with the 1998 sample, the design facilitates overlap in the
first-stage units (area segments) between each two successive years in
the five-year design. The 2004 NSDUH continued the 50 percent overlap
by retaining approximately half of the first-stage sampling units from
the 2003 survey. This design increases the precision of estimates in
year-to-year trend analysis. The sample is stratified on multiple
levels, beginning with states. Eight states are considered large
sample states and contribute approximately 3,600 respondents per
state. The remaining states are sampled to yield 900 respondents per
state. The second level of stratification divides states into Field
Interviewer (FI) Regions. For the first stage of sampling, each FI
region was partitioned into small geographic areas composed of
adjacent Census blocks. These geographic clusters of blocks are
referred to as segments and served as the primary sampling units
(PSUs) for the coordinated five-year sample design. In advance of the
survey period, specially trained listers visited each area segment
and listed all addresses for housing units and eligible group quarters
units in a prescribed order. Systematic sampling was used to select
the allocated sample of addresses from each segment. Each respondent
who completed a full interview was given a $30 cash payment. Persons
were selected from the address roster using a handheld computer. To
improve the precision of estimates, the sample allocation process
targeted five age groups: 12-17, 18-25, 26-34, 35-49, and 50 and
older. The size measures used in selecting the area segments were
coordinated with the dwelling unit and person selection process so
that a nearly self-weighting sample could be achieved in each of the
five age groups. The sample design included approximately equal
numbers of persons in the 12-17, 18-25, and 26 and older age
groups. The achieved sample for the 2004 NSDUH was 67,760 persons. The
public use file contains 55,602 records due to a subsampling step used
in the disclosure protection procedures. Minimum item response
requirements were defined for cases to be retained for weighting and
further analysis (i.e., "usable" cases). These requirements, as well
as full sampling methodology, are detailed in the codebook.
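The systematic selection of addresses within a segment can be sketched as follows. This is a generic illustration of systematic sampling with a random start and a fixed interval, assuming an ordered address list; it is not the survey's production procedure, and the function and parameter names are hypothetical.

```python
import random

def systematic_sample(addresses, n, seed=None):
    """Select n items from an ordered list using a random start and a
    fixed sampling interval (generic systematic sampling sketch)."""
    total = len(addresses)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of addresses")

    rng = random.Random(seed)
    interval = total / n                 # may be fractional
    start = rng.random() * interval      # random start within first interval

    selected = []
    for i in range(n):
        # Clamp guards against floating-point rounding at the upper end.
        position = min(int(start + i * interval), total - 1)
        selected.append(addresses[position])
    return selected

# Hypothetical segment listing of 100 addresses, 10 to be selected.
listing = list(range(100))
sample = systematic_sample(listing, 10, seed=1)
```

Because the listing is in a prescribed order, a fixed interval spreads the selected addresses across the whole segment rather than clustering them.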
The "basic sampling weights" are equal to the inverse of the
probabilities of selection of sample respondents. To obtain "final
NSDUH weights," the basic weights were adjusted to take into account
dwelling unit-level and individual-level nonresponse and then further
adjusted to ensure consistency with intercensal population projections
from the United States Bureau of the Census. In the 2004 NSDUH, a
split-sample design for respondents aged 18 or older was implemented.
Thus, in 2004, two person-level analysis weights were created in
addition to ANALWT_C: SPDWT_C and DEPWT_C. These weights
were created for specific types of person-level analyses. Depending on
the section(s) of the 2004 survey from which the variable(s)
originated, one of the three sampling weights must be selected and
applied. Please refer to the Processor Notes in the codebook for
details on determining the appropriate weight to use when analyzing a
specific variable or combination of variables.
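The two ideas behind the weights, inverse-probability base weights and a nonresponse adjustment, can be sketched in a few lines of Python. This is a simplified weighting-class adjustment under assumed inputs, not the NSDUH weighting procedure, which is described in the codebook; the field names ("weight", "cell", "responded") are hypothetical, and the final adjustment to population projections is omitted.

```python
def basic_weight(selection_prob):
    """Basic sampling weight: inverse of the probability of selection."""
    return 1.0 / selection_prob

def nonresponse_adjust(cases):
    """Within each adjustment cell, scale respondents' weights so they sum
    to the total weight of all sampled cases in that cell."""
    total_by_cell = {}
    resp_by_cell = {}
    for c in cases:
        total_by_cell[c["cell"]] = total_by_cell.get(c["cell"], 0.0) + c["weight"]
        if c["responded"]:
            resp_by_cell[c["cell"]] = resp_by_cell.get(c["cell"], 0.0) + c["weight"]

    adjusted = []
    for c in cases:
        if c["responded"]:
            factor = total_by_cell[c["cell"]] / resp_by_cell[c["cell"]]
            adjusted.append({**c, "weight": c["weight"] * factor})
    return adjusted

# Hypothetical cases: selection probability 0.01 gives a base weight near 100.
cases = [
    {"cell": "A", "weight": 100.0, "responded": True},
    {"cell": "A", "weight": 100.0, "responded": False},
    {"cell": "B", "weight": 50.0, "responded": True},
]
respondents = nonresponse_adjust(cases)
```

After adjustment the respondents' weights sum to the same total as the full sample, so the respondents stand in for the nonrespondents in their cell.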
Mode of Data Collection:
audio computer-assisted self interview (ACASI),
computer-assisted personal interview (CAPI)
The study yielded a weighted screening response rate
of 91 percent and a weighted interview response rate for the Computer
Assisted Interview (CAI) of 77 percent.
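A weighted response rate of this kind is, in general terms, the weighted count of completed cases divided by the weighted count of eligible cases. The short Python sketch below shows that calculation under assumed inputs; the field names and the eligibility rule are hypothetical, and the exact definitions used for the survey are given in the codebook.

```python
def weighted_response_rate(cases):
    """Weighted completed cases divided by weighted eligible cases."""
    eligible = sum(c["weight"] for c in cases if c["eligible"])
    completed = sum(
        c["weight"] for c in cases if c["eligible"] and c["completed"]
    )
    if eligible == 0:
        raise ValueError("no eligible cases")
    return completed / eligible

# Hypothetical cases; ineligible cases are excluded from the denominator.
cases_rr = [
    {"weight": 3.0, "eligible": True, "completed": True},
    {"weight": 1.0, "eligible": True, "completed": False},
    {"weight": 5.0, "eligible": False, "completed": False},
]
rate = weighted_response_rate(cases_rr)
```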
Extent of Processing: ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of
disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major
statistical software formats as well as standard codebooks to accompany the data. In addition to
these procedures, ICPSR performed the following processing steps for this data collection:
Performed consistency checks.
Created online analysis version with question text.
Checked for undocumented or out-of-range codes.
Restrictions: Users are reminded by the United States Department of
Health and Human Services that these data are to be used solely for
statistical analysis and reporting of aggregated information and not for
the investigation of specific individuals or treatment facilities.