mental health services,
substance abuse treatment,
Date of Collection:
Unit of Observation:
The civilian, noninstitutionalized population of the
United States aged 12 and older, including residents of
noninstitutional group quarters such as college dormitories, group
homes, shelters, rooming houses, and civilians dwelling on military
Data Collection Notes:
Users are advised to review the errata file prior
to conducting any analyses.
Data were collected and prepared for
release by Research Triangle Institute, Research Triangle Park,
The National Household Survey on Drug Abuse survey administration and sample design changed with the implementation of the 1999 survey. Since 1999, the survey sample has employed a 50-State design with an independent, multistage area probability sample for each of the 50 States and the District of Columbia. Therefore, estimates produced from the 1999, 2000, and 2001 surveys are not comparable to those produced from the 1998 and earlier surveys.
For selected variables, statistical imputation was
performed following logical inference to replace missing
responses. These variables are identified in the codebook as
"...LOGICALLY ASSIGNED" for the logical procedure, or by the
designation "IMPUTATION-REVISED" in the variable label when the
statistical procedure was also performed. The names of statistically
imputed variables begin with the letters "IR". For each
imputation-revised variable there is a corresponding imputation
indicator variable that indicates whether a case's value on the
variable resulted from an interview response or was imputed. Missing
values for some demographic variables were imputed by the unweighted
hot-deck technique used in previous NHSDAs. Beginning in 1999,
imputation of missing values for many other variables was accomplished
using predictive mean neighborhoods (PMN), a new procedure developed
specifically for the NHSDA. Both the hot-deck and PMN imputation
procedures are described in the codebook.
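The naming convention above can be checked mechanically. The following is a minimal sketch, assuming the codebook's variable labels have already been read into a dictionary; the variable names and labels shown are hypothetical placeholders, not entries from the actual codebook. It keeps a variable only when its name begins with "IR" and its label carries the "IMPUTATION-REVISED" designation, so that an unrelated name that happens to start with "IR" is not picked up.

```python
def imputation_revised(labels):
    """Return names that start with "IR" and whose label says IMPUTATION-REVISED.

    `labels` maps variable name -> variable label (both strings).
    """
    return sorted(
        name
        for name, label in labels.items()
        if name.upper().startswith("IR") and "IMPUTATION-REVISED" in label.upper()
    )


# Hypothetical names and labels, for illustration only.
labels = {
    "IRVAR1": "EXAMPLE ITEM - IMPUTATION-REVISED",
    "IRON": "HYPOTHETICAL ITEM THAT IS NOT IMPUTED",
    "VAR2": "EXAMPLE ITEM",
}

revised = imputation_revised(labels)
print(revised)
```

The matching imputation indicator variable for each result still has to be looked up in the codebook, since its naming is not stated here.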
To protect the privacy
of respondents, all variables that could be used to identify
individuals have been encrypted or collapsed in the public use file.
To further ensure respondent confidentiality, the data producer applied
data substitution and deleted state identifiers and a subsample of
records in creating the public use file.
Published estimates may not be exactly reproducible from the variables
in the public use file due to the disclosure protection procedures
that were implemented.
The data definition and dictionary files
for Stata are designed to be compatible with StataSE, Version 8. This
is a large data file requiring that approximately 250 megabytes of
Random Access Memory be allocated to Stata. Operations within Stata,
including conversion of the ASCII data to Stata format, are likely to
be slow. Analysts may wish to download subsets of data from the
SAMHDA Data Analysis System (DAS) for use with Stata.
A multistage area probability sample for each of the 50
states and the District of Columbia has been used since 1999. A coordinated five-year sample
design was developed for 1999 through 2003. Although there is no
overlap with the 1998 sample, the design facilitated an overlap in
first-stage units (area segments) between each two successive years in
the five-year design. This design was intended to increase the
precision of estimates in year-to-year trend analyses because of the
expected positive correlation resulting from the overlapping
sample. To obtain the required precision at the state level and to
improve the precision of cigarette brand data for youths at the
national level, youths and young adults were oversampled. The result
was that each state's sample was approximately equally distributed
among three major age groups: 12 to 17 years, 18 to 25 years, and 26
years or older. The achieved sample for the 2000 computer-assisted
interview (CAI) sample was 71,764 persons. The public use file has
58,680 records due to the subsampling step used in the disclosure
protection procedures. Minimum item response requirements were defined
for cases to be retained for weighting and further analysis (i.e.,
"usable" cases). These requirements, as well as full sampling
methodology, are detailed in the codebook.
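As a quick arithmetic check on the two record counts quoted above, the short sketch below computes how many records the disclosure subsampling removed and what share of the achieved CAI sample remains in the public use file. The variable names are illustrative only.

```python
# Counts quoted in the sampling description above.
achieved_cai_sample = 71_764
public_use_records = 58_680

# Records dropped by the disclosure-protection subsampling step.
removed = achieved_cai_sample - public_use_records

# Share of the achieved sample retained in the public use file.
share_retained = public_use_records / achieved_cai_sample

print(removed)
print(f"{share_retained:.1%}")
```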
Due to unequal selection probabilities at multiple stages of sample selection and various adjustments, such as those for nonresponse
and poststratification, the 2000 NHSDA sample is not self-weighting.
Analysts are advised to use the sample weight when attempting to use
the NHSDA data to draw inferences about the target population or any
subdomain of the target population. All estimates published in SAMHSA
reports (such as the Summary of Findings from the 2000 NHSDA) are
weighted using the final analysis weight for the full sample (ANALWT). For the
public use file, the corresponding final sample weight is denoted
ANALWT_C, where C stands for confidentiality protection. This sample weight
is the number of persons in the target population that each record
on the file represents. Consequently, the sum of ANALWT_C over all
records on the data file is an estimate of the total number
of people in the target population.
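To illustrate how the weight enters an estimate, here is a minimal sketch using three made-up records. Only the weight name ANALWT_C comes from the description above; the `used` flag and all values are hypothetical. The sum of the weights estimates the population total, and a weighted proportion divides the weight carried by flagged records by that total.

```python
def weighted_total(weights):
    """Estimated number of persons represented by the records."""
    return sum(weights)


def weighted_proportion(flags, weights):
    """Share of the represented population for which the flag is true."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w for f, w in zip(flags, weights) if f) / total


# Hypothetical toy records; real values come from the public use file.
records = [
    {"ANALWT_C": 1500.0, "used": 1},
    {"ANALWT_C": 2500.0, "used": 0},
    {"ANALWT_C": 1000.0, "used": 1},
]

weights = [r["ANALWT_C"] for r in records]
flags = [r["used"] for r in records]

population_estimate = weighted_total(weights)
weighted_share = weighted_proportion(flags, weights)
unweighted_share = sum(flags) / len(flags)

print(population_estimate, weighted_share, unweighted_share)
```

In this toy case the weighted and unweighted shares differ, which is the reason the sample weight is recommended. The sketch yields point estimates only; standard errors for a multistage design need the design information described in the codebook.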
Mode of Data Collection:
audio computer-assisted self interview (ACASI),
computer-assisted personal interview (CAPI)
The study yielded a weighted screening response rate
of 93 percent and a weighted interview response rate for the CAI of 74
percent.
Extent of Processing: ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of
disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major
statistical software formats as well as standard codebooks to accompany the data. In addition to
these procedures, ICPSR performed the following processing steps for this data collection:
Performed consistency checks.
Created online analysis version with question text.
Checked for undocumented or out-of-range codes.
Restrictions: Users are reminded by the United States Department of
Health and Human Services that these data are to be used solely for
statistical analysis and reporting of aggregated information and not
for the investigation of specific individuals or treatment