In the income section, which was interviewer-administered, a split-sample study had been embedded within the 2006 and 2007 surveys to compare a shorter version of the income questions with a longer set of questions that had been used in previous surveys. This shorter version was adopted for the 2008 NSDUH and will be used for future NSDUHs.
For selected variables, statistical imputation was performed following logical inference to replace missing responses. These variables are identified in the codebook as "...LOGICALLY ASSIGNED" for the logical procedure, or by the designation "IMPUTATION-REVISED" in the variable label when the statistical procedure was also performed. The names of statistically imputed variables begin with the letters "IR". For each imputation-revised variable, a corresponding imputation indicator variable indicates whether a case's value on the variable resulted from an interview response or was imputed. Missing values for some demographic variables were imputed by the unweighted hot-deck technique used in previous surveys. Beginning in 1999, imputation of missing values for most variables was accomplished using predictive mean neighborhoods (PMN), a new procedure developed specifically for this survey. Both the hot-deck and PMN imputation procedures are described in the codebook.
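As a rough illustration of the general idea behind an unweighted hot-deck, the Python sketch below fills each missing value with a value drawn at random, with equal probability, from respondents in the same imputation class, and records an indicator of whether the value was imputed. It is a generic sketch under stated assumptions, not the procedure documented in the codebook: the function name, the use of None as the missing marker, the class variables, and the "_imputed" indicator name are all hypothetical, and only the "IR" prefix mirrors the naming convention described above.

```python
import random
from collections import defaultdict

MISSING = None  # hypothetical missing-value marker used only in this sketch


def hot_deck_impute(records, target, class_vars, seed=0):
    """Return copies of `records` with an imputation-revised value and an indicator.

    records    : list of dicts, one per respondent
    target     : name of the variable to impute
    class_vars : variables that define the imputation classes (assumed)
    """
    rng = random.Random(seed)

    # Pool the observed (donor) values within each imputation class.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not MISSING:
            key = tuple(rec[v] for v in class_vars)
            donors[key].append(rec[target])

    result = []
    for rec in records:
        out = dict(rec)
        key = tuple(rec[v] for v in class_vars)
        pool = donors.get(key, [])
        if rec[target] is MISSING and pool:
            # Unweighted: every donor in the class is equally likely to be chosen.
            out["IR" + target] = rng.choice(pool)
            out[target + "_imputed"] = True
        else:
            # Keep the interview response; a value with no donor stays missing.
            out["IR" + target] = rec[target]
            out[target + "_imputed"] = False
        result.append(out)
    return result


sample = [
    {"AGEGRP": 1, "EDUC": 3},
    {"AGEGRP": 1, "EDUC": MISSING},
    {"AGEGRP": 2, "EDUC": MISSING},
]
revised = hot_deck_impute(sample, "EDUC", ["AGEGRP"])
```

In this toy input the second record can only receive the value of the single donor in its class, while the third record has no donor and remains missing. A production procedure would need a fallback for empty classes.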
Although the design of the 2011 survey is similar to the design of the 1999 through 2001 surveys, important methodological changes made since 2002 affect the 2011 estimates. Each NSDUH respondent since 2002 has been given an incentive payment of $30, a change that improved the survey response rate. In addition, in 2002 and 2011 new population data from the 2000 and 2010 decennial Censuses, respectively, became available for use in NSDUH sample weighting procedures. Therefore, the data from 2002 and later should not be compared with data collected in 2001 or earlier to assess changes over time.
Since 1999, the survey sample has employed a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia.
The setup and dictionary files for Stata are designed to be compatible with StataSE, Version 8. This is a large data file requiring that approximately 400 megabytes of Random Access Memory be allocated to Stata. Operations within Stata, including conversion of the ASCII data to Stata format, are likely to be slow. Analysts may wish to download subsets of data from the SAMHDA Survey Documentation and Analysis (SDA) system for use with Stata.
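Analysts who already hold the full ASCII file and want to cut it down before loading it into a statistical package could, as one possible approach, stream the file line by line and keep only the columns they need. The Python sketch below assumes a fixed-width layout. The function name, variable names, and column positions are hypothetical placeholders; the real positions would have to be taken from the setup and dictionary files.

```python
def read_columns(lines, layout):
    """Yield one dict per record, keeping only the fields named in `layout`.

    lines  : any iterable of text lines (for example, an open file object)
    layout : dict mapping a variable name to (start, end), where start is
             0-based and end is exclusive (hypothetical positions)
    """
    for line in lines:
        # Slice out each requested field and drop surrounding whitespace.
        yield {name: line[start:end].strip() for name, (start, end) in layout.items()}


# Hypothetical layout and two made-up records, for illustration only.
example_layout = {"ID": (0, 3), "AGE": (5, 7)}
example_lines = ["0012 45\n", "0021 37\n"]
subset = list(read_columns(example_lines, example_layout))
```

Because the function is a generator, passing an open file object instead of a list reads one record at a time, so memory use is driven by the size of the subset that is kept, not by the size of the full file.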
Prior to the 2002 survey, this series was titled National Household Surveys on Drug Abuse.
Data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina.
To protect the privacy of respondents, all variables that could be used to identify individuals have been encrypted or collapsed in the public use file. To further ensure respondent confidentiality, the data producer used data substitution and deletion of state identifiers and a subsample of records in the creation of the public use file.
Previously published estimates may not be exactly reproducible from the variables in the public use file due to the disclosure protection procedures that were implemented.