 Substance Abuse and Mental Health Data Archive

Data Use Tutorial

Are you interested in determining the average age at which men versus women begin smoking? What if you need to know about differences in marijuana use based on age, gender, education, or race? These and countless other questions can be answered by studies in the Substance Abuse and Mental Health Data Archive (SAMHDA) data holdings.

SAMHDA is a project of the Inter-university Consortium for Political and Social Research (ICPSR), so some of the Web links will take you to ICPSR sites.

Let's Get Started...

Overview

SAMHDA is an initiative of the Office of Applied Studies, Substance Abuse and Mental Health Services Administration (SAMHSA) of the United States Department of Health and Human Services. The goal of the archive is to provide ready access to substance abuse and mental health research data. This will increase the use of data for understanding and assessing substance abuse and mental health problems, as well as the impact of related treatment systems. The data archive is working to expand the variety of file formats in which data are available.

The SAMHDA holdings consist mainly of raw data derived from surveys and administrative records. These data were originally collected for specific research or administrative purposes. However, the data have research potential that outlives the original purposes for which they were collected. SAMHDA preserves these valuable data resources and makes them publicly available for secondary analysis. The SAMHDA data holdings contain over 100 studies and 370 downloadable data files.

The public-use files in the SAMHDA holdings may not always match the original research data. Because some of the data are sensitive, SAMHDA goes to great lengths to protect respondent identity. For this reason, variables that pose an identification (or disclosure) risk are modified, encrypted, or removed from the data file.

If you don't already know which study you are interested in, then you might want to explore one of the available search features on the SAMHDA Web site. For every study archived by SAMHDA, the Web site allows users to download the data set(s) to a personal computer for analysis using their own statistical software. Users can also view or download all available documentation files for the study. In addition, many studies are available to users for online analyses using the online Survey Documentation and Analysis system (SDA). Finally, a few SAMHDA studies have an available feature called Quick Tables, which allows users to produce analytic tables by choosing from pre-selected high-interest variables available in drop-down menus.

You will need basic statistical skills for using the SDA or Quick Tables systems and more advanced skills for running analyses with your own software. Although SAMHDA does not produce publications or reports, you can view findings that others have published by selecting the Related Literature link for any given study on the search results page or at the top of each study description. This link will take you to a list of bibliographic citations for publications based on that study.

Terms, formats, and statistical vocabulary

Before you begin working with SAMHDA data, we suggest that you familiarize yourself with the methodological terms used widely throughout this Web site. If you need help with social science terminology or with basic computing concepts, we strongly suggest that you consult the following sources:

Searching and finding the data you need

One way to locate a particular study archived by SAMHDA is to Browse All Studies. This feature provides a directory of all our holdings, with a link to the Description page for each study or study series. From the description pages, you may choose to download files associated with the study, initiate an online analysis session with the Survey Documentation and Analysis System (where available), or access the links to related reports or Web sites.

Another way to find data sets or variables of interest is to use the Search Options utility on the SAMHDA front page or on the Download Data Sets page. There are three ways to search SAMHDA holdings using the Search utility:

  • The Variables search option searches question text, value labels, and variable labels for every variable in all the studies currently available on the Survey Documentation and Analysis System. This is the default setting for the search utility.
  • The Study Descriptions search option searches for keywords or phrases within the descriptions of all the studies in SAMHDA's holdings.
  • The Web Site search option searches for keywords or phrases on the SAMHDA Web site.

For additional information on finding data, we strongly encourage you to consult our How to Find Data help documentation.

Example 1 below shows the results of a variable-level search on the keyword "heroin".

Studies that appear on the variable-level Search Results page will include one or more of the following links:

  • The List Matching Variables link will take you to a page that lists all the variables that match the search parameters. Additional information on this feature can be found in the Office of Applied Studies, Substance Abuse and Mental Health Services Administration (SAMHSA) report titled: Finding Specific Variables in the NHSDA.
  • The Study Description link will take you to the corresponding study description (for example, study number 4138), which gives a useful summary of the data contents, scope, time period, and other details that will help you to determine the relevance of a data set for your research.
  • The Online Analysis link is only present for studies that have online analysis components, and will take you to a page that lists the available online analysis components related to that particular study.
  • The Browse & Download link will take you to the corresponding browse documentation page, which allows you to view/download the documentation files (codebooks, questionnaires).

Example 1: Variable-Level Search Results Screen for Keyword "heroin"

screenshot of the search results page

Example 2 below shows the results of a study description search on the keyword "methamphetamine".

Studies that appear on the study descriptions Search Results page will include one or more of the following links:

  • The Description link will take you to the corresponding study description (for example, study number 4138), which gives a useful summary of the data contents, scope, time period, study citation, and other details that will help you to determine the relevance of a data set for your research.
  • The Download link takes you to a screen that walks you through a series of steps to download the data and documentation files for the selected study.
  • The Online Analysis link is only present for studies that have online analysis components, and will take you to a page that lists the available online analysis components related to that particular study.
  • The Related Literature link will return a list of bibliographic citations for publications based upon the data.

Example 2: Study Descriptions Search Results Screen for Keyword "methamphetamine"

screenshot of the study descriptions page

Once you have found a data set you wish to further investigate, you may proceed to Accessing data.

Accessing data

Even though SAMHDA is a project of the Inter-university Consortium for Political and Social Research (ICPSR), its data access requirements and procedures differ from those of ICPSR.

Since SAMHDA is federally funded, data in the SAMHDA archive are in the public domain. Therefore, users do not need to obtain permission to access, analyze, or publish findings based on SAMHDA data. We do ask that all users of SAMHDA data sets abide by the Responsible Use Policy and properly cite the data files.

Users should note that some of the data elsewhere in the ICPSR archive, which were acquired and processed through support of the ICPSR membership, are downloadable only by individuals at ICPSR member institutions.

Downloading Data

To download the data for a study, you must first select either the download tab on the Study Description page or the Download link for a study that appears on a Search Results page. If you have not already logged in, selecting the Download link/tab will take you to a MyData login screen. You can log in as a Guest/Anonymously or as a Returning User.

Logging in anonymously as a guest is the fastest way to proceed to the download page.

To log in as a returning user, you will need to create a MyData account by registering as a user. Individuals who plan to frequently access/download data from the SAMHDA archives will want to consider creating a MyData account. This service offers users of SAMHDA data additional functionality.

Note: You must set your browser to accept cookies in order for login to succeed. Your login session will last until you close your browser or log out. If you encounter technical problems, our FAQ section provides solutions to common problems.

Once you have logged in, you will be directed to the Download page for the study.

The sample Download screen shown below in Example 3 (example of study number 4138, National Survey on Drug Use and Health - this link requires login) walks you through a series of steps to download the files you need. Step 1 lists the available data formats, while Step 2 lists the data sets in the study. Select your preferences for download, and then click the button in Step 3 to add the files to your Data Cart. You may use Step 4 to review the contents of your cart. When you are ready, click the Download Data Cart button in Step 5. You will then be redirected to a Terms of Use statement screen. Before you can continue with the download, you will need to click the I Agree button to verify that you have read and agree with the Terms of Use. You can then download a zipped folder that contains all your files.

If you're uncertain whether the study fits your needs, we recommend first browsing the codebook or other documentation files, which can be accessed by clicking on the Browse Documentation tab.

Example 3: Download Screen

screenshot of the download page

Viewing the description file is also good practice when preparing to download data. The description file contains a general description of the study, and sometimes supplies important information about file formats and other unique characteristics of the data.

Analyzing Data Using Statistical Software

As we have mentioned above, statistical software is necessary to define, manipulate, and extract variables and cases within data files. Furthermore, interpretation and analysis of the data require at least basic statistical skills and preferably some knowledge of a statistical software package. Students of the social sciences usually acquire this knowledge during their first two years in college.

Many of the SAMHDA studies now have software-specific data files. When available, SAMHDA currently provides software-specific data files for SAS (transport files), SPSS (portable or system files), and Stata (system files). There are also SAS and Stata supplemental syntax files, which are used to apply value labels (SAS) or to apply the missing value recodes (SAS and Stata).
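To illustrate what a missing value recode does, here is a minimal sketch in Python rather than SAS or Stata. The variable name AGE_FIRST_SMOKED and the codes 97, 98, and 99 are hypothetical, chosen only for illustration; the actual codes for a study are listed in its codebook and supplemental syntax files.

```python
# Hypothetical missing-value codes for one variable. Real codes come from
# each study's codebook or supplemental syntax file.
MISSING_CODES = {"AGE_FIRST_SMOKED": {97, 98, 99}}


def apply_missing_recodes(record, missing_codes=MISSING_CODES):
    """Return a copy of the record with declared missing codes set to None."""
    cleaned = dict(record)
    for variable, codes in missing_codes.items():
        if cleaned.get(variable) in codes:
            cleaned[variable] = None
    return cleaned


# Two made-up cases: one valid answer and one coded as missing.
records = [
    {"CASEID": 1, "AGE_FIRST_SMOKED": 15},
    {"CASEID": 2, "AGE_FIRST_SMOKED": 98},
]
cleaned_records = [apply_missing_recodes(r) for r in records]
```

Without a recode like this, a code such as 98 would be treated as a real age and would distort any average you compute.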

All SAMHDA studies include ASCII data files and setup files, which allow the user to read ASCII data files into statistical software packages. SAMHDA currently provides setup files for SAS, SPSS, and Stata, three of the more commonly used analytical software packages for the social sciences.

SAS, SPSS, and Stata setup files and software-specific data files can be used either with the procedural versions of these packages or with their Windows counterparts. The following instructions explain the different components of SAS, SPSS, and Stata setup files. Setup files for certain collections may not contain all of the commands listed below.

  • Examples of Setup Files
  • How to Use Setup Files to Read ASCII Data
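As a rough illustration of what a setup file does, the Python sketch below reads a fixed-width ASCII file by column position. It is not a SAMHDA setup file: the variable names (CASEID, SEX, AGE) and their column locations are assumptions made up for this example. For a real study, take the column locations from the setup file or codebook.

```python
import io

# Hypothetical layout: (variable name, start column, end column), with
# columns numbered from 1 and both ends inclusive, as codebooks list them.
LAYOUT = [("CASEID", 1, 4), ("SEX", 5, 5), ("AGE", 6, 7)]


def read_fixed_width(lines, layout=LAYOUT):
    """Parse fixed-width text lines into a list of dictionaries."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        record = {}
        for name, start, end in layout:
            field = line[start - 1:end].strip()
            # Blank fields become None; all other fields are read as integers.
            record[name] = int(field) if field else None
        records.append(record)
    return records


# Three made-up data lines stand in for a downloaded ASCII file.
sample = io.StringIO("0001134\n0002227\n0003 19\n")
rows = read_fixed_width(sample)
```

To read a downloaded file instead of the in-memory sample, you could pass an open file object, for example `read_fixed_width(open("data.txt"))`, where the file name is a placeholder.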

Online Data Analysis Using SDA

In order to make its data holdings usable for a wider audience, SAMHDA has adopted a user-friendly online Survey Documentation and Analysis system (SDA). Many SAMHDA data collections are available for online data analysis. This means that users can explore the data using certain statistical procedures without downloading data files and setup files, and without being familiar with any statistical package. SDA also offers a very useful option for those familiar with a statistical package: Users can create custom subsets of cases and/or variables and download the subsets to their computers. This option is attractive when the data file is very large, but only a few variables for a certain age, sex, or other subgroup are of interest.
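The idea of a custom subset can be sketched in a few lines of Python. This is only an illustration of the concept, not how SDA works internally; the variable names and the sample cases are hypothetical.

```python
def make_subset(cases, keep_variables, keep_case):
    """Keep only the selected variables for the cases that pass the filter."""
    return [
        {variable: case[variable] for variable in keep_variables}
        for case in cases
        if keep_case(case)
    ]


# Made-up cases with more variables than the analysis needs.
cases = [
    {"CASEID": 1, "SEX": "Male", "AGE": 34, "REGION": "West"},
    {"CASEID": 2, "SEX": "Female", "AGE": 17, "REGION": "South"},
    {"CASEID": 3, "SEX": "Female", "AGE": 52, "REGION": "West"},
]

# Select two variables for female respondents aged 18 or older.
adult_women = make_subset(
    cases,
    keep_variables=["CASEID", "AGE"],
    keep_case=lambda c: c["SEX"] == "Female" and c["AGE"] >= 18,
)
```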

SDA was developed and is maintained by the Computer-assisted Survey Methods Program (CSM) at the University of California, Berkeley.

The Online Survey Documentation and Analysis Tutorial is an excellent resource that takes you on a guided interactive tour of some of the most commonly used features of SDA. Another useful resource is Berkeley's Online Help Files for SDA Users. Also, many of the analytical terms and options within SDA have clickable links that will bring up online help documentation.

SAMHDA currently uses SDA Version 3.1 for all studies formatted for this analysis package. Version 3.1 enables users to:

  • Browse the SDA codebook
  • Perform certain statistical procedures, such as:
    • List values of individual cases
    • Frequencies
    • Crosstabulations
    • Comparisons of means
    • Correlation matrix
    • Multiple regression
  • Create bar, line, and pie charts for frequencies and crosstabulations
  • Manipulate variables
    • Recode variables
    • Compute new variables
    • List newly created variables
  • Create customized subsets of selected variables and cases
  • Download the subset, including:
    • ASCII data file with optional delimiters
    • SAS, SPSS, or Stata setup files
    • Codebook customized for the selected subset

For SAMHDA studies where SDA is available, there are several ways to access SDA: click on the Online Analysis link that appears on the search results page, select the Analyze and Subset tab that appears at the top of the Description page, or browse a list of data collections available for online analysis.
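If you are new to the procedures SDA offers, the Python sketch below shows what a crosstabulation and a comparison of means compute. It does not use SDA, and the cases and variable names (SEX, EVER_SMOKED, AGE_FIRST_SMOKED) are invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Made-up cases; None marks a missing value.
cases = [
    {"SEX": "Male", "EVER_SMOKED": "Yes", "AGE_FIRST_SMOKED": 15},
    {"SEX": "Male", "EVER_SMOKED": "No", "AGE_FIRST_SMOKED": None},
    {"SEX": "Female", "EVER_SMOKED": "Yes", "AGE_FIRST_SMOKED": 17},
    {"SEX": "Female", "EVER_SMOKED": "Yes", "AGE_FIRST_SMOKED": 19},
]


def crosstab(cases, row_var, col_var):
    """Count the cases in each (row value, column value) cell."""
    return Counter((case[row_var], case[col_var]) for case in cases)


def means_by_group(cases, group_var, value_var):
    """Average value_var within each group, skipping missing values."""
    groups = defaultdict(list)
    for case in cases:
        if case[value_var] is not None:
            groups[case[group_var]].append(case[value_var])
    return {group: mean(values) for group, values in groups.items()}


table = crosstab(cases, "SEX", "EVER_SMOKED")
group_means = means_by_group(cases, "SEX", "AGE_FIRST_SMOKED")
```

Here `table[("Female", "Yes")]` counts the women who have ever smoked, and `group_means` holds the average starting age for each sex, which is the kind of question posed at the start of this tutorial.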

Note regarding studies with a complex sample design:
 
Some of the studies archived by SAMHDA used a complex sample design. For studies that used a complex sampling design, great care should be taken when evaluating the results of analyses run on SDA. This is because only some statistical output correctly takes into account the complex sample design of the study. We are working to improve future versions of SDA so all available analysis features and statistical output will correctly account for studies with a complex design. Please refer to the online documentation for the most recent information on which SDA analytic features and statistical output correctly account for complex design. To determine whether a study used a complex sample design, refer to the codebook for that study.
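To see why the sample design matters, consider the simplified Python sketch below, which compares an unweighted mean with a weighted mean. The ages and weights are invented for illustration. Applying weights corrects only the estimate itself. Correct standard errors also require the study's design variables (such as strata and clusters) and software that supports complex samples, which this sketch does not attempt.

```python
def weighted_mean(values, weights):
    """Mean of values in which each case counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical ages at first use and hypothetical analysis weights.
ages = [15, 17, 19]
weights = [300.0, 100.0, 100.0]

unweighted = sum(ages) / len(ages)
weighted = weighted_mean(ages, weights)
```

In this made-up example the first case represents three times as many people as each of the others, so the weighted mean falls below the unweighted mean.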

Producing Quick Tables

As previously mentioned, several SAMHDA studies have a feature called Quick Tables. This feature allows you to produce analytic tables by choosing from among preselected high-interest variables available in drop-down menus. Quick Tables is easier to use than other types of analysis, but it is also more restrictive because of the limited number of variables available. Quick Tables is also powered by the Survey Documentation and Analysis (SDA) system.

For a step-by-step example of how to use Quick Tables, see the Office of Applied Studies, Substance Abuse and Mental Health Services Administration (SAMHSA) report titled: Introducing Quick Tables.

Need More Information?

If you have additional questions, please take a look at our help documentation, which features a searchable Frequently Asked Questions database. SAMHDA also provides user support through email and a toll-free helpline.