National Survey on Drug Use and Health, 2008 (ICPSR 26701)
Alternate Title: NSDUH 2008
Principal Investigator(s): United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Office of Applied Studies
The National Survey on Drug Use and Health (NSDUH) series (formerly titled National Household Survey on Drug Abuse) primarily measures the prevalence and correlates of drug use in the United States. Detailed NSDUH 2008 documentation is available from SAMHSA. The surveys are designed to provide quarterly, as well as annual, estimates. Information is provided on the use of illicit drugs, alcohol, and tobacco among members of United States households aged 12 and older. Questions included age at first use as well as lifetime, annual, and past-month usage for the following drug classes: marijuana, cocaine (and crack), hallucinogens, heroin, inhalants, alcohol, tobacco, and nonmedical use of prescription drugs, including pain relievers, tranquilizers, stimulants, and sedatives. The survey covered substance abuse treatment history and perceived need for treatment, and included questions from the Diagnostic and Statistical Manual (DSM) of Mental Disorders that allow diagnostic criteria to be applied. The survey included questions concerning treatment for both substance abuse and mental health related disorders. Respondents were also asked about personal and family income sources and amounts, health care access and coverage, illegal activities and arrest record, problems resulting from the use of drugs, and needle-sharing. Questions introduced in previous administrations were retained in the 2008 survey, including questions asked only of respondents aged 12 to 17. These "youth experiences" items covered a variety of topics, such as neighborhood environment, illegal activities, drug use by friends, social support, extracurricular activities, exposure to substance abuse prevention and education programs, and perceived adult attitudes toward drug use and activities such as school work. Several measures focused on prevention-related themes in this section. 
Also retained were questions on mental health and access to care, perceived risk of using drugs, perceived availability of drugs, driving and personal behavior, and cigar smoking. Questions on the tobacco brand used most often were introduced with the 1999 survey. For the 2008 survey, adult mental health questions were added to measure suicidal ideation and symptoms of psychological distress in the worst period of distress that a person experienced in the past 30 days. A split-sample design also was included to administer separate sets of questions to assess impairment due to mental health problems. Background information includes gender, race, age, ethnicity, marital status, educational level, job status, veteran status, and current household composition.
These data are freely available.
WARNING: This study is over 150MB in size and may take several minutes to download on a typical internet connection.
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Office of Applied Studies. National Survey on Drug Use and Health, 2008. ICPSR26701-v6. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2015-11-23. http://doi.org/10.3886/ICPSR26701.v6
Persistent URL: http://doi.org/10.3886/ICPSR26701.v6
This study was funded by:
- United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Office of Applied Studies (283-2004-00022)
Scope of Study
Subject Terms: addiction, alcohol, alcohol abuse, alcohol consumption, amphetamines, barbiturates, cocaine, controlled drugs, crack cocaine, demographic characteristics, depression (psychology), drinking behavior, drug abuse, drug dependence, drug treatment, drug use, drugs, employment, hallucinogens, health care, heroin, households, income, inhalants, marijuana, mental health, mental health services, methamphetamine, pregnancy, prescription drugs, sedatives, smoking, stimulants, substance abuse, substance abuse treatment, tobacco use, tranquilizers, youths
Geographic Coverage: United States
Date of Collection:
Unit of Observation: individual
Universe: The civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters such as college dormitories, group homes, shelters, rooming houses, and civilians dwelling on military installations.
Data Types: survey data
Data Collection Notes:
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina.
Since 1999, the survey sample has employed a 50-State design with an independent, multistage area probability sample for each of the 50 States and the District of Columbia.
Prior to the 2002 survey, this series was titled National Household Surveys on Drug Abuse.
Although the design of the 2008 survey is similar to the design of the 1999 through 2001 surveys, there are important methodological differences since 2002 that affect the 2008 estimates. Each NSDUH respondent since 2002 has been given an incentive payment of $30. This change resulted in an improvement in the survey response rate. In addition, in 2002 new population data from the 2000 decennial Census became available for use in NSDUH sample weighting procedures. Therefore the data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
For selected variables, statistical imputation was performed following logical inference to replace missing responses. These variables are identified in the codebook as "...LOGICALLY ASSIGNED" for the logical procedure, or by the designation "IMPUTATION-REVISED" in the variable label when the statistical procedure was also performed. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable, a corresponding imputation indicator variable indicates whether a case's value on the variable resulted from an interview response or was imputed. Missing values for some demographic variables were imputed by the unweighted hot-deck technique used in previous surveys. Beginning in 1999, imputation of missing values for most variables was accomplished using predictive mean neighborhoods (PMN), a new procedure developed specifically for this survey. Both the hot-deck and PMN imputation procedures are described in the codebook.
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. To further ensure respondent confidentiality, the data producer used data substitution and deletion of state identifiers and a subsample of records in the creation of the public use file.
Previously published estimates may not be exactly reproducible from the variables in the public use file due to the disclosure protection procedures that were implemented.
The setup and dictionary files for Stata are designed to be compatible with StataSE, Version 8. This is a large data file requiring that approximately 250 megabytes of Random Access Memory be allocated to Stata. Operations within Stata, including conversion of the ASCII data to Stata format, are likely to be slow. Analysts may wish to download subsets of data from the SAMHDA Survey Documentation and Analysis (SDA) system for use with Stata.
In the income section, which was interviewer-administered, a split-sample study had been embedded within the 2006 and 2007 surveys to compare a shorter version of the income questions with a longer set of questions that had been used in previous surveys. This shorter version was adopted for the 2008 NSDUH and will be used for future NSDUHs.
Sample: A multistage area probability sample for each of the 50 states and the District of Columbia has been used since 1999. The 2005 NSDUH was the first survey in a coordinated five-year sample design. Although there is no overlap with the 1999-2004 samples, the coordinated design for 2005 through 2009 facilitated a 50 percent overlap in second-stage units (area segments [see below]) between each two successive years from 2005 through 2009. This design was intended to increase precision of estimates in year-to-year trend analyses because of the expected positive correlation resulting from the overlapping sample between successive survey years. The 2008 design allows for computation of estimates by state in all 50 states plus the District of Columbia. States may therefore be viewed as the first level of stratification as well as a reporting variable. Eight states, referred to as the large sample states, had a sample designed to yield 3,600 respondents per state for the 2008 survey. This sample size was considered adequate to support direct state estimates. The remaining 43 states (which include the District of Columbia) had a sample designed to yield 900 respondents per state in the 2008 survey. In these 43 states, adequate data were available to support reliable state estimates based on SAE methodology. Within each state, sampling strata called state sampling (SS) regions were formed. Based on a composite size measure, states were partitioned geographically into roughly equal-sized regions. In other words, regions were formed such that each area yielded, in expectation, roughly the same number of interviews during each data collection period. The eight large sample states were divided into 48 SS regions each. The remaining states were divided into 12 SS regions each. Therefore, the partitioning of the United States resulted in the formation of a total of 900 SS regions. 
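The allocation arithmetic in the sample description can be checked with a short sketch. The constant and function names are illustrative only, and the total target number of respondents is derived here from the per-state figures rather than stated in the study description.

```python
# Figures are taken from the sample description; all names are illustrative.
LARGE_STATES = 8        # "large sample" states
OTHER_STATES = 43       # remaining states, including the District of Columbia
REGIONS_LARGE = 48      # SS regions per large sample state
REGIONS_OTHER = 12      # SS regions per remaining state
TARGET_LARGE = 3600     # designed respondents per large sample state
TARGET_OTHER = 900      # designed respondents per remaining state


def total_ss_regions():
    """Total number of state sampling (SS) regions across all 51 areas."""
    return LARGE_STATES * REGIONS_LARGE + OTHER_STATES * REGIONS_OTHER


def total_target_respondents():
    """Designed (not achieved) number of respondents implied by the targets."""
    return LARGE_STATES * TARGET_LARGE + OTHER_STATES * TARGET_OTHER


if __name__ == "__main__":
    print(total_ss_regions())          # 384 + 516 = 900
    print(total_target_respondents())  # 28800 + 38700 = 67500
```

The first figure reproduces the 900 SS regions given in the description; the second is only the design target and is not expected to equal the achieved sample size.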
Unlike the 1999 through 2004 surveys, the first stage of selection for the 2005 through 2009 NSDUHs was Census tracts. The first stage of selection began with the construction of an area sample frame that contained one record for each Census tract in the United States. If necessary, Census tracts were aggregated within SS regions until each tract had, at a minimum, 150 dwelling units in urban areas and 100 dwelling units in rural areas. These Census tracts served as the primary sampling units (PSUs) for the coordinated five-year sample. One area segment (one or more Census blocks) was selected within each sampled Census tract. In advance of the survey period, specially trained listers had visited each area segment and listed all addresses for housing units and eligible group quarters units in a prescribed order. Systematic sampling was used to select the allocated sample of addresses from each segment. Beginning in 2002, each respondent who completed a full interview was given a $30 cash payment as a token of appreciation for his or her time. To improve the precision of the estimates, the sample allocation process targeted five age groups: 12 to 17 years, 18 to 25 years, 26 to 34 years, 35 to 49 years, and 50 years or older. The size measures used in selecting the area segments were coordinated with the dwelling unit and person selection process so that a nearly self-weighting sample could be achieved in each of the five age groups. The achieved sample size for the 2008 survey was 67,928 persons. The public use file contains 55,739 records due to a subsampling step used in the disclosure protection procedures. A key step in the data processing procedures established the minimum item response requirements in order for cases to be retained for weighting and further analysis (i.e., "usable" cases). These requirements, as well as full sampling methodology, are detailed in the codebook.
Due to unequal selection probabilities at multiple stages of sample selection and various adjustments, such as those for nonresponse and poststratification, the 2008 NSDUH sample design is not self-weighting. Analysts are advised to use the final sample weight when attempting to use the 2008 NSDUH data to draw inferences about the target population or any subdomains of the target population. All estimates published in SAMHSA reports (such as the results from the 2008 NSDUH) are weighted using the final analysis weight for the full sample (ANALWT). For the public use file, the corresponding final sample weight is denoted as ANALWT_C, with the "C" denoting confidentiality protection. This sample weight represents the total number of target population persons each record on the file represents. Note that the sum of ANALWT_C, over all records on the data file, represents an estimate of the total number of people in the target population.
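As a minimal sketch of how the final sample weight enters a point estimate, the example below computes a weighted population total and a weighted proportion. Only ANALWT_C comes from the study description; the indicator name USED_FLAG and the toy records are hypothetical, and the sketch does not cover variance estimation, which depends on the design information documented in the codebook.

```python
def population_total(records, weight="ANALWT_C"):
    """Sum of weights: an estimate of the number of people represented."""
    return sum(r[weight] for r in records)


def weighted_proportion(records, indicator, weight="ANALWT_C"):
    """Weighted share of the represented population with indicator == 1."""
    total = population_total(records, weight)
    if total == 0:
        raise ValueError("weights sum to zero")
    hits = sum(r[weight] for r in records if r[indicator] == 1)
    return hits / total


# Hypothetical toy records; USED_FLAG is an illustrative 0/1 indicator.
toy = [
    {"ANALWT_C": 1000.0, "USED_FLAG": 1},
    {"ANALWT_C": 3000.0, "USED_FLAG": 0},
    {"ANALWT_C": 4000.0, "USED_FLAG": 1},
]

if __name__ == "__main__":
    print(population_total(toy))                  # 8000.0
    print(weighted_proportion(toy, "USED_FLAG"))  # 0.625
```

In the toy data the unweighted share is 2 of 3 records, while the weighted share is 0.625, which illustrates why unweighted tabulations should not be used for population inference.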
In the mental health module of the 2008 NSDUH, the adult sample was split into sample A (MHSAMP08 = 1), which received the World Health Organization-Disability Assessment Scale (WHODAS) questions LIREMEM through LIAD68, and sample B (MHSAMP08 = 2), which received the Sheehan Disability Scale (SDS) questions MHAD66a through MHAD68. The mental health adult split-sample weight (MHSAWT_C) was created to accommodate analysis using either one of the split samples; it is the product of the person-level analysis weight (ANALWT_C) and a poststratification adjustment that was done separately for sample A and sample B. For both sample A and sample B, the split-sample weight was controlled to the Census population estimates for the civilian, noninstitutionalized population aged 18 or older. MHSAWT_C was set to zero for all 12- to 17-year-olds and for the 10 adults who were not assigned to either of the split samples. MHSAWT_C can be used for generating mental health estimates from sample A data alone or from sample B data alone. However, if mental health estimates are generated from sample A and sample B data combined, then the person-level analysis weight (ANALWT_C) should be used.
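The weight-selection rule for the split samples can be written as a small helper. This is a sketch under stated assumptions: the function names and the "A"/"B" labels are illustrative, while MHSAMP08, MHSAWT_C, and ANALWT_C and the codes 1 and 2 are taken from the description.

```python
# Codes for MHSAMP08 as given in the description: 1 = sample A, 2 = sample B.
SAMPLE_CODES = {"A": 1, "B": 2}


def mental_health_weight_name(samples):
    """Return the weight variable to use for the requested split sample(s)."""
    samples = set(samples)
    if not samples or not samples <= set(SAMPLE_CODES):
        raise ValueError("samples must be a non-empty subset of {'A', 'B'}")
    # One split sample alone: split-sample weight. Both combined: ANALWT_C.
    return "MHSAWT_C" if len(samples) == 1 else "ANALWT_C"


def select_split_sample(records, samples):
    """Keep only records assigned to the requested split sample(s)."""
    codes = {SAMPLE_CODES[s] for s in samples}
    return [r for r in records if r.get("MHSAMP08") in codes]


if __name__ == "__main__":
    print(mental_health_weight_name(["A"]))       # MHSAWT_C
    print(mental_health_weight_name(["A", "B"]))  # ANALWT_C
```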
Mode of Data Collection: audio computer-assisted self interview (ACASI), computer-assisted personal interview (CAPI), computer-assisted self interview (CASI)
Response Rates: Strategies for ensuring high rates of participation resulted in a weighted screening response rate of 88.62 percent and a weighted interview response rate for the CAI of 74.24 percent. (Note that these response rates reflect the original sample, not the subsampled data file referenced in this document.)
Extent of Processing: ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:
- Performed consistency checks.
- Standardized missing values.
- Created online analysis version with question text.
- Checked for undocumented or out-of-range codes.
Restrictions: Users are reminded that these data are to be used solely for statistical analysis and reporting of aggregated information and not for the investigation of specific individuals or treatment facilities.
Original ICPSR Release: 2009-11-16
- 2015-11-23 Covers for the PDF documentation were revised.
- 2014-09-05 Changed the Stata system data file from version 13 to version 12 for compatibility on a wider range of systems. Updated codebook to include correct frequencies from the data.
Since the release of the previous version of the 2008 Public Use Data File and Codebook a number of variables have been added. Some of the additional variables are revised versions of previous variables and replace the ones removed from the data file.
A model to predict adult mental illness was revised in the 2012 NSDUH to produce more accurate estimates. The mental illness variables included in the 2008 NSDUH data files are based on the revised 2012 model.
One new geographic variable (PDEN00) has been added to the data file to replace the previous PDEN variable. In this case, the only difference is the variable name. The variable has been renamed to indicate which census data were used in its construction.
- 2013-06-24 Released methodological resource documentation and updated xml file to include variable groupings.
- 2012-12-10 The 2008 NSDUH public-use data file has been updated to include 23 new variables related to non-medical drug usage and adult depression. Please view Table 4 of the codebook for more information on these variables.
- 2009-12-16 Corrections were made to misspelled variable labels for variables CPNPSYYR and CPNPSYMN.
Browse Matching Variables
The following variable, YTHACT2, was created by counting the number of positive responses reported over the following 4 youth activity questions: School-based (YESCHACT), community-based (YECOMACT), church or faith-based (YEFAIACT), or other activities (YEOTHACT). Youth respondents who reported participation in 2 or more activities were included in the "2 or More Activities" category regardless of how many other activity questions were answered. Youth respondents who did not answer 2 or more of the 4 youth activity questions with a "yes" or "no" response were coded as SAS missing. To be included in the "None or 1 Activity" category, a youth respondent must have answered 2 or more of the activity questions and reported they participated in either zero or one activity.
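The counting rule for YTHACT2 can be sketched as follows. The response coding (1 = yes, 2 = no, anything else treated as unanswered) is an assumption made for illustration and should be checked against the codebook; the function name is hypothetical, and the category labels are returned as text rather than as the file's numeric codes.

```python
YES, NO = 1, 2  # assumed response codes; verify against the codebook


def derive_ythact2(yeschact, yecomact, yefaiact, yeothact):
    """Recode four youth activity items into the YTHACT2 categories.

    Returns a category label, or None where the description says the
    value is coded as missing.
    """
    answers = [yeschact, yecomact, yefaiact, yeothact]
    n_yes = sum(1 for a in answers if a == YES)
    n_answered = sum(1 for a in answers if a in (YES, NO))

    if n_yes >= 2:
        # Two or more "yes" responses decide the category outright.
        return "2 or More Activities"
    if n_answered < 2:
        # Fewer than 2 valid yes/no answers: missing.
        return None
    return "None or 1 Activity"


if __name__ == "__main__":
    print(derive_ythact2(1, 1, None, None))  # 2 or More Activities
    print(derive_ythact2(1, 2, None, None))  # None or 1 Activity
    print(derive_ythact2(1, None, None, None))  # None
```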
The following variable, MDEIMPY, is derived from the maximum severity level of MDE role impairment (YSDSOVRL) and is restricted to adolescents with past year MDE (YMDEYR).
NOTE: The following five variables are named the same as in previous years since they were only defined for youth in previous years; therefore, no changes are needed for these in order to generate analyses across years.
NOTE: The following four variables make up the four role domains (chores at home, school or work, close relationships with family, and social life) of the Sheehan Disability Scale (SDS), which measures the impact of a disorder on an adolescent's life.
NOTE: The variables in this section are recoded variables that were created from one or more of the edited variables from the preceding section.
NOTE: Since 2005, all adult respondents aged 18 or older and adolescent (youth) respondents aged 12 to 17 have been administered separate age-appropriate depression modules designed to measure whether or not respondents had experienced a major depressive episode (MDE) in their lifetime and past year. Relevant data from both adults and adolescents were used to create recoded past year and lifetime MDE variables. Even though these MDE recodes used the relevant sections of the questionnaire for adults and adolescents, single variables were created for the entire population. For example, the lifetime MDE variable in the datasets prior to 2008 contained both the adolescent lifetime MDE recode created using data from the adolescent section and the adult lifetime MDE recode created using data from the adult section. Even though the MDE data for adults and adolescents were contained in single combined variables, starting with the 2005 Detailed Tables, all MDE estimates were presented separately for adults and youths due to non-comparability stemming from wording differences between the two depression modules. In 2008, instead of a single variable containing the recoded MDE data for both adult and youth respondents, two recoded sets of MDE variables were created for adolescents and adults separately. These variables were assigned new variable names and included in the respective recoded depression sections of the codebook. Neither the questions in the adult and adolescent depression modules nor the specifications used to create the MDE recodes have changed. However, changes were implemented in the mental health module (see the Recoded Mental Health Module Variable Documentation Appendix) administered to adult respondents aged 18 or older.
These changes, together with the fact that the revised mental health module precedes the adult depression module within the questionnaire, could have influenced adult respondents' answers to subsequent depression-related items. An analysis of potential context effects determined that the adult MDE items, and therefore the MDE estimates, may have been affected by a context effect. As a result, the adult depression data are not considered comparable to prior years, and all adult depression recode variables starting in 2008 have been renamed to indicate the lack of comparability with prior years (see Recoded Adult Depression section). The youth depression data and associated MDE estimates were not affected and are considered comparable to prior years. However, despite remaining comparable to prior years, the variables that shared the same variable names for both youth and adult depression data were assigned new names in 2008 since the adult data were no longer comparable with prior data. These renamed youth variables can be used in comparisons or analyses with data prior to 2008 by giving the recoded adolescent depression variables the same names across all years. Youth depression variables that did not require renaming are noted specifically in the comments above the variables below. The 2008 Detailed Tables contain youth MDE estimates from 2004 through 2008. Specific details about these recoded variables and further information about context effects on the MDE estimates are provided in the Recoded Depression Variable Documentation Appendix.
The following variable, YMDELT, classifies an adolescent as having had a major depressive episode (MDE) in their lifetime (YMDELT=1) if they reported experiencing at least 5 out of the 9 criteria used to define an adolescent as having had MDE in their lifetime, where at least one of the criteria is a depressed mood or loss of interest or pleasure in daily activities (YODSMMDE=1).
An adolescent was classified as NOT having had a major depressive episode (MDE) in their lifetime (YMDELT=2) if they met either of these conditions:
(1) Reported experiencing fewer than 5 out of the 9 criteria used to define an adolescent as having had MDE in their lifetime (YODSMMDE=2).
(2) The number of criteria used to define an adolescent as having had MDE in their lifetime is unknown (YODSMMDE=98) and the adolescent reported at least one of the following:
(I) Never having had a period of time lasting several days or longer when felt sad, empty, or depressed (YODPREV=2), discouraged about how things were going in life (YODSCEV=2), and lost interest in most things usually enjoyable (YOLOSEV=2).
(II) Never having had a period of time when being sad, discouraged, or having lost interest in most things usually enjoyable lasted nearly every day for two weeks or longer (YOLSI2WK=2 or YODPR2WK=2).
(III) Having a period of time when being sad, discouraged, or having lost interest in most things usually enjoyable lasted nearly every day for two weeks or longer and the sadness, discouragement, or loss of interest lasted less than an hour when mood was most severe and frequent (YOWRHRS=1).
(IV) Having a period of time when being sad, discouraged, or having lost interest in most things usually enjoyable lasted nearly every day for two weeks or longer and, during those times when mood was most severe and frequent, emotional distress was mild (YOWRDST=1), there was never a time when emotional distress was so severe that the respondent could not be cheered up (YOWRCHR=4), and there was never a time when emotional distress was so severe that the respondent could not carry out daily activities (YOWRIMP=4).
(V) Having a period of time when being sad, discouraged, or having lost interest in most things usually enjoyable lasted nearly every day for two weeks or longer and never had any other problems during those weeks, such as changes in sleep, appetite, energy, the ability to concentrate and remember, or feelings of low self-worth (YODPPROB=2).
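The YMDELT rule above can be expressed as a decision function. This is a sketch, not the data producer's code: it checks only the variable codes cited in the text, the function name is hypothetical, and returning None for cases that match neither classification is an assumption, since the description does not say how such cases are coded.

```python
def classify_ymdelt(r):
    """Classify lifetime MDE for an adolescent record (a dict of codes).

    Returns 1 (lifetime MDE), 2 (no lifetime MDE), or None when neither
    rule applies (assumed here; not specified in the description).
    """
    criteria = r.get("YODSMMDE")
    if criteria == 1:   # at least 5 of 9 criteria, incl. mood or interest
        return 1
    if criteria == 2:   # fewer than 5 of 9 criteria
        return 2
    if criteria == 98:  # number of criteria unknown: check conditions (I)-(V)
        conditions = (
            # (I) never sad, never discouraged, and never lost interest
            r.get("YODPREV") == 2 and r.get("YODSCEV") == 2
            and r.get("YOLOSEV") == 2,
            # (II) never a period lasting two weeks or longer
            r.get("YOLSI2WK") == 2 or r.get("YODPR2WK") == 2,
            # (III) mood lasted less than an hour at its worst
            r.get("YOWRHRS") == 1,
            # (IV) mild distress, could be cheered up, could function
            r.get("YOWRDST") == 1 and r.get("YOWRCHR") == 4
            and r.get("YOWRIMP") == 4,
            # (V) no other problems during those weeks
            r.get("YODPPROB") == 2,
        )
        if any(conditions):
            return 2
    return None


if __name__ == "__main__":
    print(classify_ymdelt({"YODSMMDE": 1}))                  # 1
    print(classify_ymdelt({"YODSMMDE": 98, "YOWRHRS": 1}))   # 2
    print(classify_ymdelt({"YODSMMDE": 98}))                 # None
```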
NOTE: The variables in this section are recoded variables that were created from one or more of the edited variables from the preceding section.