National Survey on Drug Use and Health, 2004 (ICPSR 4373)
The National Survey on Drug Use and Health (NSDUH) series (formerly titled National Household Survey on Drug Abuse) measures the prevalence and correlates of drug use in the United States. The surveys are designed to provide quarterly, as well as annual, estimates. Information is provided on the use of illicit drugs, alcohol, and tobacco among members of United States households aged 12 and older. Questions included age at first use as well as lifetime, annual, and past-month usage for the following drug classes: marijuana, cocaine (and crack), hallucinogens, heroin, inhalants, alcohol, tobacco, and nonmedical use of prescription drugs, including pain relievers, tranquilizers, stimulants, and sedatives.
The survey covered substance abuse treatment history and perceived need for treatment, and included questions from the Diagnostic and Statistical Manual of Mental Disorders (DSM) that allow diagnostic criteria to be applied. The survey included questions concerning treatment for both substance abuse and mental health-related disorders. Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, and needle-sharing.
Questions introduced in previous administrations were retained in the 2004 survey, including questions asked only of respondents aged 12 to 17. These "youth experiences" items covered a variety of topics, such as neighborhood environment, illegal activities, drug use by friends, social support, extracurricular activities, exposure to substance abuse prevention and education programs, and perceived adult attitudes toward drug use and activities such as school work. Several measures in this section focused on prevention-related themes. Also retained were questions on mental health and access to care, perceived risk of using drugs, perceived availability of drugs, driving and personal behavior, and cigar smoking.
Questions on the tobacco brand used most often were introduced with the 1999 survey and retained through the 2003 survey. Background information includes gender, race, age, ethnicity, marital status, educational level, job status, veteran status, and current household composition. In addition, in 2004 Adult and Adolescent Mental Health modules were added.
The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.
WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Office of Applied Studies. National Survey on Drug Use and Health, 2004. ICPSR04373-v4. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2015-11-23. http://doi.org/10.3886/ICPSR04373.v4
Persistent URL: http://doi.org/10.3886/ICPSR04373.v4
This study was funded by:
- United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Office of Applied Studies (283-98-9008)
Scope of Study
Subject Terms: addiction, alcohol, alcohol abuse, alcohol consumption, amphetamines, barbiturates, cocaine, controlled drugs, crack cocaine, demographic characteristics, depression (psychology), drinking behavior, drug abuse, drug dependence, drug treatment, drug use, drugs, hallucinogens, heroin, households, income, inhalants, marijuana, mental health, mental health services, methamphetamine, prescription drugs, sedatives, smoking, stimulants, substance abuse, substance abuse treatment, tobacco use, tranquilizers, youths
Geographic Coverage: United States
Universe: The civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters such as college dormitories, group homes, shelters, rooming houses, and civilians dwelling on military installations.
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina.
Prior to the 2002 survey, this series was titled National Household Surveys on Drug Abuse.
Although the design of the 2004 survey is similar to the design of the 1999 through 2001 surveys, important methodological changes introduced in 2002 affect the 2004 estimates. Each NSDUH respondent since 2002 has been given an incentive payment of $30. This change improved the survey response rate. In addition, in 2002 new population data from the 2000 decennial Census became available for use in NSDUH sample weighting procedures. Therefore, the data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
For selected variables, statistical imputation was performed following logical inference to replace missing responses. These variables are identified in the codebook as "...LOGICALLY ASSIGNED" for the logical procedure, or by the designation "IMPUTATION-REVISED" in the variable label when the statistical procedure was also performed. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable, a corresponding imputation indicator variable shows whether a case's value on the variable resulted from an interview response or was imputed. Missing values for some demographic variables were imputed by the unweighted hot-deck technique used in previous surveys. Beginning in 1999, imputation of missing values for most variables was accomplished using predictive mean neighborhoods (PMN), a new procedure developed specifically for this survey. Both the hot-deck and PMN imputation procedures are described in the codebook.
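The pairing of an imputation-revised variable with an imputation indicator can be illustrated with a minimal Python sketch of a generic sequential, unweighted hot-deck. This is not the procedure documented in the codebook: the imputation classes, the record layout, the variable names EDUC and AGE_GROUP, and the indicator name prefix IMPFLAG_ are all assumptions made for illustration. Only the "IR" prefix comes from the description above.

```python
def hot_deck_impute(records, var, class_vars):
    """Generic sequential, unweighted hot-deck (illustrative only).

    records    : list of dicts, assumed already sorted in the desired order
    var        : name of the variable to impute (None marks a missing value)
    class_vars : names of the variables that define the imputation classes

    Returns new records with two added fields:
      "IR" + var        -> imputation-revised value
      "IMPFLAG_" + var  -> hypothetical indicator: True if the value was imputed
    """
    last_donor = {}  # most recent observed value seen in each imputation class
    revised = []
    for rec in records:
        key = tuple(rec[c] for c in class_vars)
        new_rec = dict(rec)
        value = rec.get(var)
        if value is not None:
            # Interview response: keep it and remember it as a potential donor.
            last_donor[key] = value
            new_rec["IR" + var] = value
            new_rec["IMPFLAG_" + var] = False
        elif key in last_donor:
            # Missing: borrow the value of the most recent donor in the same class.
            new_rec["IR" + var] = last_donor[key]
            new_rec["IMPFLAG_" + var] = True
        else:
            # No donor seen yet in this class: the value stays missing.
            new_rec["IR" + var] = None
            new_rec["IMPFLAG_" + var] = False
        revised.append(new_rec)
    return revised


# Hypothetical toy records; EDUC and AGE_GROUP are illustrative names.
sample_records = [
    {"AGE_GROUP": 1, "EDUC": 3},
    {"AGE_GROUP": 1, "EDUC": None},
    {"AGE_GROUP": 2, "EDUC": None},
    {"AGE_GROUP": 2, "EDUC": 4},
]
imputed = hot_deck_impute(sample_records, "EDUC", ["AGE_GROUP"])
```

In this sketch the second record receives its value from the first, while the third record stays missing because no donor has yet been seen in its class. A production procedure would need a rule for that case.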
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. To further ensure respondent confidentiality, the data producer used data substitution and deletion of state identifiers and a subsample of records in the creation of the public use file.
Previously published estimates may not be exactly reproducible from the variables in the public use file due to the disclosure protection procedures that were implemented.
The data definition and dictionary files for Stata are designed to be compatible with StataSE, Version 8. This is a large data file requiring that approximately 250 megabytes of Random Access Memory be allocated to Stata. Operations within Stata, including conversion of the ASCII data to Stata format, are likely to be slow. Analysts may wish to download subsets of data from the SAMHDA Data Analysis System (DAS) for use with Stata.
Since 1999, the survey sample has employed a 50-State design with an independent, multistage area probability sample for each of the 50 States and the District of Columbia.
Sample: A multistage area probability sample for each of the 50 states and the District of Columbia has been used since 1999. The 2004 sample design is a continuation of the coordinated five-year sample design that was implemented for the 1999 through 2003 surveys. Although there is no overlap with the 1998 sample, the design facilitates overlap in the first-stage units (area segments) between each two successive years in the five-year design. The 2004 NSDUH continued the 50 percent overlap by retaining approximately half of the first-stage sampling units from the 2003 survey. This design increases the precision of estimates in year-to-year trend analysis. The sample is stratified on multiple levels, beginning with states. Eight states are considered large sample states and contribute approximately 3,600 respondents per state. The remaining states are sampled to yield 900 respondents per state. The second level of stratification divides states into Field Interviewer (FI) Regions. For the first stage of sampling, each FI region was partitioned into small geographic areas composed of adjacent Census blocks. These geographic clusters of blocks are referred to as segments and served as the primary sampling units (PSUs) for the coordinated five-year sample design. In advance of the survey period, specially trained listers visited each area segment and listed all addresses for housing units and eligible group quarters units in a prescribed order. Systematic sampling was used to select the allocated sample of addresses from each segment. Persons were selected from the address roster using a handheld computer. Each respondent who completed a full interview was given a $30 cash payment. To improve the precision of estimates, the sample allocation process targeted five age groups: 12-17, 18-25, 26-34, 35-49, and 50 and older.
The size measures used in selecting the area segments were coordinated with the dwelling unit and person selection process so that a nearly self-weighting sample could be achieved in each of the five age groups. The sample design included approximately equal numbers of persons in the 12-17, 18-25, and 26 and older age groups. The achieved sample for the 2004 NSDUH was 67,760 persons. The public use file contains 55,602 records due to a subsampling step used in the disclosure protection procedures. Minimum item response requirements were defined for cases to be retained for weighting and further analysis (i.e., "usable" cases). These requirements, as well as full sampling methodology, are detailed in the codebook.
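The systematic selection of addresses from an ordered segment listing can be sketched in Python as follows. This is a textbook fractional-interval systematic sample, offered as an illustration under assumptions rather than the selection algorithm actually used for the survey, which is detailed in the codebook. The function name, its parameters, and the toy listing are hypothetical.

```python
import random


def systematic_sample(units, n, start=None):
    """Select n units from an ordered list by systematic sampling.

    units : ordered listing (for example, addresses in their listed order)
    n     : number of units to select, 0 < n <= len(units)
    start : optional starting point in [0, k), where k = len(units) / n;
            drawn at random when omitted
    """
    total = len(units)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of listed units")
    k = total / n  # sampling interval (may be fractional)
    if start is None:
        start = random.random() * k  # random start in [0, k)
    elif not 0 <= start < k:
        raise ValueError("start must lie in [0, k)")
    # Step through the listing at interval k; min() guards against
    # floating-point rounding pushing the last index past the end.
    return [units[min(int(start + i * k), total - 1)] for i in range(n)]


# Hypothetical listing of 10 addresses in a segment, sampling 5 of them.
listing = ["address_%02d" % i for i in range(10)]
selected = systematic_sample(listing, 5, start=1.5)
```

Because the listing is traversed at a fixed interval from a single random start, the selected addresses are spread evenly across the prescribed listing order.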
Weight: The "basic sampling weights" are equal to the inverse of the probabilities of selection of sample respondents. To obtain "final NSDUH weights," the basic weights were adjusted to take into account dwelling unit-level and individual-level nonresponse and then further adjusted to ensure consistency with intercensal population projections from the United States Bureau of the Census. In the 2004 NSDUH, a split-sample design for respondents aged 18 or older was implemented. As a result, two person-level analysis weights were created for 2004 in addition to ANALWT_C: SPDWT_C and DEPWT_C. These weights were created for specific types of person-level analyses. Depending on the section(s) of the 2004 survey from which the variable(s) originated, one of the three sampling weights must be selected and applied. Please refer to the Processor Notes in the codebook for details on determining the appropriate weight to use when analyzing a specific variable or combination of variables.
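The role of the weights can be illustrated with a short Python sketch: a basic weight computed as the inverse of a selection probability, and a weighted prevalence point estimate computed with a chosen analysis weight. The weight name ANALWT_C comes from the description above. The indicator name USED_PAST_MONTH, the record layout, and the toy values are assumptions for illustration. The sketch produces point estimates only and does not address variance estimation, which depends on design information documented in the codebook.

```python
def basic_weight(selection_probability):
    """Basic sampling weight: the inverse of the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def weighted_prevalence(records, indicator, weight="ANALWT_C"):
    """Weighted proportion of records whose indicator equals 1.

    Records whose indicator is neither 0 nor 1 (for example, missing codes)
    are excluded from both the numerator and the denominator.
    """
    numerator = 0.0
    denominator = 0.0
    for rec in records:
        if rec[indicator] in (0, 1):
            denominator += rec[weight]
            if rec[indicator] == 1:
                numerator += rec[weight]
    if denominator == 0:
        raise ValueError("no usable records for this indicator")
    return numerator / denominator


# Hypothetical toy records; USED_PAST_MONTH is an illustrative variable name
# and the weights are made-up values, not taken from the data file.
toy_records = [
    {"USED_PAST_MONTH": 1, "ANALWT_C": 100.0},
    {"USED_PAST_MONTH": 0, "ANALWT_C": 300.0},
    {"USED_PAST_MONTH": 1, "ANALWT_C": 600.0},
]
estimate = weighted_prevalence(toy_records, "USED_PAST_MONTH")
```

For variables that call for one of the other two weights, the same function could be called with weight="SPDWT_C" or weight="DEPWT_C", following the guidance in the Processor Notes.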
Extent of Processing: ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:
- Performed consistency checks.
- Created online analysis version with question text.
- Checked for undocumented or out-of-range codes.
Restrictions: Users are reminded by the United States Department of Health and Human Services that these data are to be used solely for statistical analysis and reporting of aggregated information and not for the investigation of specific individuals or treatment facilities.
Original ICPSR Release: 2006-05-12
- 2015-11-23 Covers for the PDF documentation were revised.
- 2013-06-21 Released Methodological Resources documentation and updated xml file to include variable groupings.
- 2013-02-06 The 2004 NSDUH public-use data file has been updated to include 26 new variables related to respondent drug use, mental health, and geography. Please view Table 4 of the codebook for more information on these variables.