
Analysis Options for NSDUH
Public-use and Restricted-use Data

NSDUH data are accessible in the following ways: public-use (downloadable data), SDA (online analysis of public-use data), R-DAS (online analysis with disclosure restrictions), and the Data Portal (virtual desktop access to restricted-use microdata). The following tables are designed to help users determine which delivery system best matches their research, based on differences in NSDUH access, availability, and variable groups.

 NSDUH Data Availability

NSDUH Public-use (downloadable data)
NSDUH SDA (online analysis of public-use data)
NSDUH R-DAS (online analysis with disclosure restrictions)
NSDUH Data Portal (virtual desktop access to restricted-use microdata)
Can data be downloaded to desktop?
  Public-use: Microdata are freely available to the public for download and analysis with statistical software.
  SDA: Custom data download option, including the capability to download a subset of the data.
  R-DAS: No download option.
  Data Portal: No download option.
Data Formats Available
  Public-use: SAS, SPSS, Stata, R, and tab-delimited.
  SDA: The primary function of the SDA is online analysis. However, the download option allows for SAS, SPSS, Stata, and tab-delimited data formats.
  R-DAS: Analyses can only be done via the SDA-based R-DAS online system.
  Data Portal: Approved applicants can receive the data as an SPSS system file, an R file (*.rda), or an ASCII data file with SAS or Stata syntax/setup files.
Years Available
  Public-use: 1979-2012 (in the early years the study was not conducted annually).
  SDA: 1979-2012 (in the early years the study was not conducted annually).
  R-DAS: 2002-2011 (individual-year data are not available; data are available only in merged blocks of 2, 4, 8, or 10 years).
  Data Portal: 2004-2012 (earlier years were not collected under CIPSEA law and cannot be made available in the Data Portal).
Types of Analysis Possible
  Public-use: Since the data can be downloaded to a desktop, CBHSQ imposes no limitation on the software or analytic method an analyst may choose.
  SDA: The SDA online system has options to run frequencies, crosstabulations, comparisons of means, and regression.
  R-DAS: Only crosstabulation analysis is available. Also, in order to provide additional variables that are usually considered confidential, individual-year data are not available. Only merged blocks of 2, 4, 8, or 10 years are available for analysis (the different combinations of years accommodate researchers willing to trade off year precision for additional cases, which improves the ability to analyze small sub-populations).
  Data Portal: Analysis can be done with the statistical software provided within the Data Portal (SAS, SPSS, Stata, R, MS Office, and SAS-callable SUDAAN). Other software can be made available if requested by the applicant and approved by the data owner/producer (CBHSQ/SAMHSA). Data can only be analyzed within the Data Portal's secure environment via a secure remote connection from an approved workspace.
Linking to Other Data Files
  Public-use: Downloaded data can be linked to other data for analysis.
  SDA: No way to link via the online analysis system. However, downloaded data can be linked to other data for analysis.
  R-DAS: No ability to link to other data at the microdata level.
  Data Portal: Data can be linked to approved external datasets. However, any data to be linked must be provided by the approved research group (URLs for publicly available data; proprietary data sent securely). SAMHDA staff will then place the data into the work folder for that project. Users are not able to copy anything into or out of the Data Portal.
Sample Size
  Public-use: The data are a sub-sample of the full sample.
  SDA: The data are a sub-sample of the full sample.
  R-DAS: The R-DAS data are a sub-sample of the full sample in the restricted-use file.
  Data Portal: The full sample is available within the Data Portal as part of the restricted-use data file.
Analytic Output Availability
  Public-use: All output produced is available for use.
  SDA: All output produced is available for use. Online analytic output can be copied and pasted into MS Word, PowerPoint, Excel, and other files.
  R-DAS: All analytic output is reviewed by automatic systems. Output that does not meet certain confidentiality thresholds is blocked, and a message is displayed stating why the analytic results were blocked. Additionally, only weighted output, rounded to one decimal place or the nearest thousand, is displayed. Output can be copied from the browser and pasted into other programs (e.g., Word, Excel, PowerPoint).
  Data Portal: While all output can be viewed by authorized users from within the Data Portal, only output that is submitted for disclosure review, and approved, is allowed outside of the secure Data Portal environment. All tables/text intended for publication must be written within the Data Portal and submitted for disclosure review/approval before becoming available outside the Data Portal.
Pair Data Analysis Availability
  Public-use: No.
  SDA: No.
  R-DAS: No.
  Data Portal: Yes; see the variable group differences worksheet for more information.
Current Schedule for Adding New Data
  Public-use: Annually.
  SDA: Annually.
  R-DAS: Biennially for 2-year files, less frequently for other files.
  Data Portal: The plan is to add new data annually. However, there is no set schedule for the application periods.
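
The R-DAS display rule noted under Analytic Output Availability (weighted output rounded to one decimal place or the nearest thousand) can be sketched as follows. This is only an illustration: it assumes that percentages are the values shown to one decimal place and that weighted counts are the values shown to the nearest thousand, which the table does not state explicitly. The function names are hypothetical and are not part of any R-DAS interface.

```python
def round_percent(value: float) -> float:
    """Round a weighted percentage to one decimal place (assumed rule)."""
    return round(value, 1)


def round_count(value: float) -> int:
    """Round a weighted count to the nearest thousand (assumed rule).

    Python's built-in round() resolves exact ties to the nearest even
    value; how R-DAS handles ties is not documented here.
    """
    return int(round(value / 1000)) * 1000


# Hypothetical values, not NSDUH estimates.
print(round_percent(12.3456))
print(round_count(1234567.8))
```

Rounding of this kind limits the precision of what is displayed; it is separate from the confidentiality thresholds that block output entirely, whose values are not given here.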

 NSDUH Variable Availability

NSDUH Public-use (downloadable data)
NSDUH SDA (online analysis of public-use data)
NSDUH R-DAS (online analysis with disclosure restrictions)
NSDUH Data Portal (virtual desktop access to restricted-use microdata)
Geographic Variables
  Public-use: A population density variable is the only type of geographic measure in the data.
  SDA: A population density variable is the only type of geographic measure in the data.
  R-DAS: State-level geography is available on all files. The 10-year file includes several sub-state levels of geography: county (for about 300 counties) and major metropolitan areas.
  Data Portal: Many geographic variables, such as state, county, and metropolitan area, are always provided. When necessary to the approved research plan, the following variables can also be provided: zip code, sampling segment ID, longitude and latitude of the centroid of the sampling segment, an indicator of whether the segment contains multiple census tracts, and the majority census tract for each segment.
Pair Analysis Variables
  Public-use: No pair analysis possible.
  SDA: No pair analysis possible.
  R-DAS: No pair analysis possible.
  Data Portal: Variables required for pair analysis are present. Instructions for creating the appropriate pair data file specific to a research plan are also available within the Data Portal.
Demographic Variables
  Public-use: Many demographic variables, like age or race, have been top coded or have had categories combined as part of the confidentiality protection. See a recent public-use codebook for the specific variables and categories that are available.
  SDA: Many demographic variables, like age or race, have been top coded or have had categories combined as part of the confidentiality protection. See a recent public-use codebook for the specific variables and categories that are available.
  R-DAS: Most demographic variables are available with their original categories. These include detailed age, expanded race and ethnicity, country of birth, at-risk indicators for substance initiation in the US, and others. See the online codebook for specific information about the variables and categories available.
  Data Portal: All demographic variables and original categories are available. Exact birthdate variables (PXBMONTH, PXBDAY, PXBYR, BRTHDATE, BIRMONTH, BIRDAY, BIRYEAR, and EIBDATE) are only provided within the Data Portal for projects whose specific research plan necessitates their use. See the full MS Excel list of Data Portal NSDUH variables.
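
For readers unfamiliar with the confidentiality protections named under Demographic Variables, the following minimal sketch shows what top coding and category combining generally mean. The cap of 65, the category labels, and the function names are all hypothetical and chosen only for illustration; the actual NSDUH recodes are documented in the public-use codebook.

```python
def top_code(value: int, cap: int) -> int:
    """Replace any value above the cap with the cap itself."""
    return min(value, cap)


def combine_categories(value: str, groups: dict, other: str = "Other") -> str:
    """Map a detailed category to a broader group; unlisted values go to `other`."""
    return groups.get(value, other)


# Hypothetical detailed ages, top coded at an assumed cap of 65.
ages = [19, 34, 67, 82]
print([top_code(age, 65) for age in ages])

# Hypothetical detailed categories collapsed into broader groups.
groups = {"Category A": "Group 1", "Category B": "Group 1"}
print(combine_categories("Category A", groups))
print(combine_categories("Category Z", groups))
```

Both operations reduce detail for rare values, which is why the restricted-use systems, where original categories are retained, carry stricter access and output controls.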
