Search results

Showing 1 – 50 of 59 results.
Curated
Simple Crosstabs

Midlife in the United States (MIDUS Refresher 1), 2011-2014 (ICPSR 36532)

Released/updated on: 2025-09-17
Geographic coverage: United States
Time period: 2008-01-01--2009-01-01, 2011-01-01--2014-01-01

In 2011-2014, the MIDUS Refresher study recruited a national probability sample of 3,577 adults, aged 25 to 74, designed to replenish the original MIDUS 1 baseline cohort and paralleling the five decadal age groups of the MIDUS 1 baseline survey [ICPSR 2760]. The MIDUS Refresher survey employed the same comprehensive assessments as those assembled on the existing MIDUS sample, but with additional questions about the effect of the economic recession of 2008-09.

The MIDUS Refresher collection is split into two datasets: Aggregate Data and Coded Text Data. The Coded Text Dataset provides coded responses to open-ended question items in the Aggregate Dataset. The survey data collection (Project 1) [MIDUS, ICPSR 2760] consisted of a 30-minute phone interview followed by two 50-page mailed self-administered questionnaires. Survey data were collected on demographic, psycho-social, and physical and mental health information. This new cross-sectional MIDUS sample allows the examination of period effects on health (mental and physical) related to the economic recession by comparing the pre-recession MIDUS 1 sample with the post-recession MIDUS Refresher sample. A further objective of the MIDUS Refresher sample was to strengthen cross-project analyses in MIDUS by increasing the sample sizes available for testing hypotheses dealing with the interplay of key factors (e.g., socioeconomic status, gender, psychosocial factors, biological factors) in mid- and later-life health. To that end, the MIDUS Refresher sample followed the same multi-disciplinary protocol established in the main MIDUS sample, in that after completing the survey protocol respondents were asked to complete a cognitive assessment by phone (Project 3) [MIDUS 3, ICPSR 36346] and later became eligible to participate in daily diary assessments (Project 2) [MIDUS 2, ICPSR 4652], biomarker assessments (Project 4) [MIDUS 2: Biomarker Project, ICPSR 29282], and neuroscience assessments (Project 5) [MIDUS 2: Neuroscience Project, ICPSR 28683].

The MIDUS Refresher was funded by the National Institute on Aging as two separate but related efforts. The MIDUS Refresher younger decades (MRY) was fielded in November 2011 and recruited over 2,100 new participants aged 25 to 54. Funding was later added for the MIDUS Refresher older decades (MRO), which was fielded in June 2013 and recruited over 1,400 new participants aged 55 to 74.

Demographic variables include age, sex, gender, race, religion, and marital status.

Curated
Simple Crosstabs

Midlife in the United States (MIDUS 3), 2013-2014 (ICPSR 36346)

Released/updated on: 2019-04-30
Geographic coverage: Contiguous United States
Time period: 2013-05-01--2014-11-01

In 1995-1996, the MacArthur Midlife Research Network carried out a national survey of over 7,000 Americans aged 25 to 74 [ICPSR 2760]. The purpose of the study was to investigate the role of behavioral, psychological, and social factors in understanding age-related differences in physical and mental health. The study was innovative for its broad scientific scope, its diverse samples (which included siblings of the main sample respondents and a national sample of twin pairs), and its creative use of in-depth assessments in key areas (e.g., daily diary of stressful experiences [ICPSR 3725] and cognitive functioning [ICPSR 3596]) on a subset of participants. A detailed description of the study and the findings generated by it is available at http://www.midus.wisc.edu

With support from the National Institute on Aging, a follow-up of the original Midlife Development in the United States (MIDUS) sample was conducted in 2004 (MIDUS 2 [ICPSR 4652]). The daily stress and cognitive functioning projects were repeated and expanded at MIDUS 2; in addition the protocol was expanded to include biomarkers and neuroscience.

In 2013 a third wave (MIDUS 3) of survey data was collected on longitudinal participants. Data collection for this follow-up wave largely repeated baseline assessments (e.g., phone interview and extensive self-administered questionnaire), with additional questions in selected areas such as economic recession experiences. Cognitive functioning data were also collected at the same time, while data collection for the daily diary, biomarker, and neuroscience projects commenced in 2017.

MIDUS also maintains a Colectica portal, which allows users to interact with variables across waves and create customized subsets. Registration is required.

Curated
Restricted

Midlife in the United States (MIDUS 2): Milwaukee African American Sample, 2005-2006 (ICPSR 22840)

Released/updated on: 2024-02-26
Geographic coverage: Milwaukee, United States, Wisconsin
Time period: 2005-01-01--2006-01-01

As a refinement to Midlife in the United States (MIDUS 2), 2004-2006 (ICPSR 4652), a sample of African Americans from Milwaukee was included to examine health issues in minority populations. Areas of the city of Milwaukee, Wisconsin, were stratified according to the proportion of the population that was African American. Those areas with high concentrations were sampled at higher rates than areas with lower concentrations. Area probability sampling methods were used along with population counts from the 2000 United States Census to identify potential respondents. Field interviewers screened households to determine if they contained any African American adults. There was additional screening to achieve an appropriate age/gender distribution in a manner similar to what was done for the original MIDUS sample, Midlife in the United States (MIDUS 1), 1995-1996 (ICPSR 2760). Milwaukee respondents were interviewed in their homes using a Computer Assisted Personal Interview (CAPI) protocol and afterwards asked to complete a Self-Administered Questionnaire (SAQ). All measures paralleled those used in the larger MIDUS 1 and 2 samples. After successful completion of the Project 1 survey, some participants were eligible to participate in other MIDUS projects (2 through 5). Survey data were collected for 592 individuals.

Curated
Simple Crosstabs

Midlife in the United States (MIDUS 1), 1995-1996 (ICPSR 2760)

Released/updated on: 2020-09-28
Geographic coverage: United States
Time period: 1995-01-01--1996-01-01

The Midlife in the United States (MIDUS) is a collaborative, interdisciplinary investigation of patterns, predictors, and consequences of midlife development in the areas of physical health, psychological well-being, and social responsibility. A description of the study and findings from it are available at http://www.midus.wisc.edu.

The first wave of the MIDUS study (MIDUS 1 or M1) collected survey data from a total of 7,108 participants. The baseline sample comprised individuals from four subsamples: (1) a national RDD (random digit dialing) sample (n=3,487); (2) oversamples from five metropolitan areas in the U.S. (n=757); (3) siblings of individuals from the RDD sample (n=950); and (4) a national RDD sample of twin pairs (n=1,914). All eligible participants were non-institutionalized, English-speaking adults in the coterminous United States, aged 25 to 74.

Data from the samples were collected primarily in 1995/96. The survey (Project 1) dataset contains responses from a 30-minute Phone interview and two 50-page Self-Administered Questionnaire (SAQ) instruments. Of the 7,108 respondents who completed the Phone interview, 6,325 also completed the SAQ.
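As a quick consistency check on the figures above, the four subsample sizes can be summed and the SAQ completion rate computed. This is only an illustrative sketch; the dictionary labels are shorthand chosen here, not variable names from the dataset.

```python
# Subsample sizes as reported in the study description.
subsamples = {
    "national RDD": 3487,
    "metropolitan oversamples": 757,
    "siblings": 950,
    "twin pairs (individuals)": 1914,
}

# The four subsamples should add up to the reported 7,108 participants.
total = sum(subsamples.values())

# Share of phone-interview respondents who also completed the SAQ.
saq_completed = 6325
saq_rate = saq_completed / total

print(total)
print(f"{saq_rate:.1%}")
```

Under these figures the subsamples sum to the reported total, and roughly 89 percent of phone respondents returned the SAQ.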

This updated version of the study is comprised of three primary datasets:

Dataset 1, Main, Siblings, and Twin Data, contains responses from the main survey of 7,108 respondents. Respondents were asked to provide extensive information on their physical and mental health throughout their adult lives, and to assess the ways in which their lifestyles, including relationships and work-related demands, contributed to the conditions experienced. Those queried were asked to describe their histories of physical ailments, including heart-related conditions and cancer, as well as the treatment and/or lifestyle changes they went through as a result. A series of questions addressed alcohol, tobacco, and illegal drug use, and focused on history of use, regularity of use, attempts to quit, and how the use of those substances affected respondents' physical and mental well-being. Additional questions addressed respondents' sense of control over their health, their awareness of changes in their medical conditions, commitment to regular exercise and a healthy diet, experience with menopause, the decision-making process used to deal with health concerns, experiences with nontraditional remedies or therapies, and history of attending support groups. Respondents were asked to compare their overall well-being with that of their peers and to describe social, physical, and emotional characteristics typical of adults in their 20's, 40's, and 60's. Information on the work histories of respondents and their significant others was also elicited, with items covering the nature of their occupations, work-related physical and emotional demands, and how their personal health had correlated to their jobs. An additional series of questions focusing on childhood queried respondents regarding the presence/absence of their parents, religion, rules/punishments, love/affection, physical/verbal abuse, and the quality of their relationships with their parents and siblings. 
Respondents were also asked to consider their personal feelings of accomplishment, desire to learn, sense of control over their lives, interests, and hopes for the future.

The Datasets previously numbered 2 and 3 have been removed to avoid redundancies, and all datasets have been renumbered. Please refer to the readme file.

Dataset 2, Twin Screener Data, provides the first national sample of twin pairs ascertained randomly via the telephone.

Dataset 3, Coded Text Responses, describes how open-ended textual responses in the MIDUS 1 Computer-Assisted Telephone Interview (CATI) and Self-Administered Questionnaire (SAQ) were transformed into categorical numeric codes. These codes are included in a stand-alone dataset containing only those cases (N=3,950) that contained text data in their responses.

Online Analysis Only: Datasets 1, 2, and 3 were merged together by the SU_ID variable to form "Merged Data with Weights (Online Analysis Only)" (Dataset 4) for online analysis capabilities.
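For readers who want to reproduce a similar merge offline, the sketch below joins extra tables onto a main table by a shared SU_ID key. It assumes a left join onto the main file, which the description does not confirm; the function name, the toy rows, and the AGE and OCC_CODE columns are hypothetical placeholders rather than actual MIDUS variables.

```python
def merge_by_id(main_rows, *other_tables, key="SU_ID"):
    """Left-join each extra table onto main_rows using the shared key."""
    # Copy the main rows so the inputs are not modified in place.
    merged = [dict(row) for row in main_rows]
    for table in other_tables:
        # Index the extra table by key for constant-time lookup.
        lookup = {row[key]: row for row in table}
        for row in merged:
            match = lookup.get(row[key], {})
            for name, value in match.items():
                if name != key:
                    row[name] = value
    return merged


# Toy data: only respondent 2 has a coded text response.
main = [{"SU_ID": 1, "AGE": 45}, {"SU_ID": 2, "AGE": 61}]
coded_text = [{"SU_ID": 2, "OCC_CODE": 17}]

merged = merge_by_id(main, coded_text)
print(merged)
```

Rows without a match in an extra table simply keep their original columns, which mirrors the fact that the Coded Text dataset covers only a subset of cases.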

MIDUS also maintains a Colectica portal, which allows users to interact with variables across waves and create customized subsets. Registration is required.

Curated
Simple Crosstabs

Midlife in the United States (MIDUS 2), 2004-2006 (ICPSR 4652)

Released/updated on: 2021-09-15
Geographic coverage: United States
Time period: 2004-01-01--2006-01-01

In 1995-1996, the MacArthur Midlife Research Network carried out a national survey of 7,108 Americans aged 25 to 74 (MIDLIFE IN THE UNITED STATES (MIDUS), 1995-1996 [ICPSR 2760]). The purpose of the study was to investigate the role of behavioral, psychological, and social factors in understanding age-related differences in physical and mental health. The study was innovative for its broad scientific scope, its diverse samples (which included twins and the siblings of main sample respondents), and its creative use of in-depth assessments in key areas (e.g., daily stress and cognitive functioning). A description of the study and findings from it are available at http://www.midus.wisc.edu. With support from the National Institute on Aging, a longitudinal follow-up of the original MIDUS samples (core sample [N = 3,487], metropolitan over-samples [N = 757], twins [N = 925 complete pairs], and siblings [N = 950]) was conducted in 2004-2006. Its guiding hypotheses, at the most general level, were that behavioral and psychosocial factors are consequential for physical and mental health. MIDUS 2 respondents were aged 35 to 86. Data collection largely repeated baseline assessments (e.g., phone interview and extensive self-administered questionnaire), with additional questions in selected areas (e.g., cognitive functioning, optimism and coping, stressful life events, and caregiving). To add refinements to MIDUS 2, an African American sample (N = 592) was recruited from Milwaukee, Wisconsin; its members participated in a personal interview and completed a questionnaire paralleling the above assessments. Survey data for the Milwaukee sample are available in a separate project [ICPSR 22840]. A modified form of the mail questionnaire was also administered by telephone to respondents who did not complete a self-administered questionnaire.

Curated

Database of Genotypes and Phenotypes (dbGaP) (ICPSR 34520)

Released/updated on: 2013-01-24

The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.

dbGaP provides two levels of access - open and controlled - in order to allow broad release of non-sensitive data, while providing oversight and investigator accountability for sensitive data sets involving personal health information. Summaries of studies and the contents of measured variables as well as original study document text are generally available to the public, while access to individual-level data, including phenotypic data tables and genotypes, requires varying levels of authorization.

Curated
Restricted

Technology, Teen Dating Violence and Abuse, and Bullying in Three States, 2011-2012 (ICPSR 34741)

Released/updated on: 2016-02-15
Geographic coverage: United States, New York (state), New Jersey, Pennsylvania
Time period: 2011-01-01--2012-01-01

This project examined the role of technology use in teen dating violence and abuse, and bullying. The goal of the project was to expand knowledge about the types of abuse experiences youth have, the extent of victimization and perpetration via technology and new media (e.g., social networking sites, texting on cellular phones), and how the experience of such cyber abuse within teen dating relationships or through bullying relates to other life factors.

This project carried out a multi-state study of teen dating violence and abuse, and bullying, the main component of which included a survey of youth from ten schools in five school districts in New Jersey, New York, and Pennsylvania, gathering information from 5,647 youth about their experiences. The study employed a cross-sectional, survey research design, collecting data via a paper-pencil survey. The survey targeted all youth who attended school on a single day and achieved an 84 percent response rate.

Curated
Restricted

Survey of Inmates in State and Federal Correctional Facilities, 1997 (ICPSR 2598)

Released/updated on: 2006-03-30
Geographic coverage: United States
Conducted by the Bureau of the Census, this survey provides nationally representative data on state prison inmates and sentenced federal inmates held in federally owned and operated facilities. Through personal interviews from June-October 1997, inmates in both state and federal prisons provided information about their current offense and sentence, criminal history, family background and personal characteristics, prior drug and alcohol use and treatment programs, gun possession and use, gang membership, and prison activities, programs, and services. Prior surveys of state prison inmates, called SURVEY OF INMATES OF STATE CORRECTIONAL FACILITIES, were conducted in 1974, 1979, 1986, and 1991 (see ICPSR 7811, 7856, 8711, and 6086). Sentenced federal prison inmates were first interviewed in 1991 (see SURVEY OF INMATES OF FEDERAL CORRECTIONAL FACILITIES, 1991 [ICPSR 6037]). The federal data are combined with the state data in this collection. Part 1, Numeric Data, consists of numerically-coded responses, while Part 2, Alphanumeric Data, contains free-field responses to "Specify, Other" questions in ASCII text form.
Curated
Restricted

CARE Corrections: Technology for Jail HIV/HCV Testing, Linkage, and Care (TLC), Washington, DC, 2012-2014 (ICPSR 39784)

Released/updated on: 2026-04-22
Geographic coverage: District of Columbia, United Kingdom
Time period: 2012-01-01--2014-01-01

This study is part of the Seek, Test, Treat and Retain (STTR) Collaboration Project that involved over twenty studies in the fields of HIV and drug abuse. All studies were independently developed, but were chosen for the collaboration because they focused on one or more steps of the HIV treatment cascade: Seek, Test, Treat and Retain. As part of the STTR Collaboration Project, the studies were grouped into Criminal Justice-related studies and Vulnerable Population-related studies. The data collected by these studies included twelve common domains (e.g., Demographic characteristics, Mental Health); in each domain, a shared questionnaire or instrument was taken up by the studies and adapted to fit each study.

The main project of the CARE+ Corrections study is in Washington, DC, and is a randomized controlled trial (RCT) evaluating the "CARE+ Corrections intervention (a computerized tool integrating HIV treatment counseling, secondary transmission risk reduction counseling, and facilitated linkage to care through text message reminders)" versus standard of care among returning citizens in Washington, DC. The study is recruiting 100 participants who are incarcerated or were released from a correctional facility less than 6 months ago.

Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2023 (ICPSR 39171)

Released/updated on: 2024-10-31
Geographic coverage: United States
Time period: 2023-01-01--2023-12-31

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and substance use. There are more than 450 variables across the questionnaires. Substance use covered by this survey includes: amphetamines (stimulants), sedatives/barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, vaping, tobacco, smokeless tobacco, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2023:

  • Change to the question stem for some lifetime, 12-month, and 30-day marijuana use questions: Please see the Highlights for 2023 section in the codebook for more details.
  • Additional information is documented in the doc39171-0001_MTFQChanges2023byForm.pdf and doc39171-0001_MTFQChanges2023byType.pdf files available for download.
  • The ICPSR-generated codebooks contain only the frequencies, question text, and response options for the survey items. Please see 39171-0001-User_guide-UsersGuide.pdf for the annual study documentation provided by MTF.
Curated

National Health Interview Survey, 2002 (ICPSR 4176)

Released/updated on: 2011-03-23
Geographic coverage: United States
The purpose of the National Health Interview Survey (NHIS) is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Implementation of a redesigned NHIS, consisting of a basic module, a periodic module, and a topical module, began in 1997 (See NATIONAL HEALTH INTERVIEW SURVEY, 1997 [ICPSR 2954]). The 2002 NHIS contains the Household, Family, Person, Sample Adult, Sample Child, Child Immunization, and Injury and Poison Episode data files from the basic module. Each record in the Household-Level File (Part 1) contains data on type of living quarters, number of families in the household responding and not responding, and the month and year of the interview for each sampling unit. The Family-Level File (Part 2) is made up of reconstructed variables from the person-level data of the basic module and includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, and employment status, along with industry and occupation. As part of the basic module, the Person-Level File (Part 3) provides information on all family members with respect to health status, limitation of daily activities, cognitive impairment, and health conditions. Also included are data on years at current residence, region variables, height, weight, bed days, doctor visits, hospital stays, and health care access and utilization. A randomly-selected adult in each family was interviewed for the Sample Adult File (Part 4) regarding respiratory conditions, renal conditions, AIDS, joint symptoms, health status, limitation of daily activities, and behaviors such as smoking, alcohol consumption, and physical activity. Also included in this file are variables pertaining to the Healthy People 2010 Objectives. 
The Sample Child File (Part 5) provides information from an adult in the household on medical conditions of one child in the household, such as respiratory problems, seizures, allergies, and use of special equipment such as hearing aids, braces, or wheelchairs. Also included are variables regarding child behavior, the use of mental health services, and Attention Deficit Hyperactivity Disorder (ADHD). The Child Immunization File (Part 6) presents information from shot records on vaccination status, number and dates of shots, and information about the chicken pox vaccine. Episode-based information regarding injuries and poisonings is found in the Injury and Poison Episode File (Part 7), which examines the cause and date of injury or poisoning, loss of time from work or school, and whether the episode resulted in hospitalization. Information in the Injury and Poison Verbatim File (Part 8) is comprised of narrative text describing injuries, including type of injury, how the injury occurred, and the body part injured. The Alternative Health Supplement (Part 9) collected information from sample adults on their use of 17 nonconventional health care practices: acupuncture, ayurveda, biofeedback, chelation therapy, chiropractic care, energy healing therapy/Reiki, folk medicine, hypnosis, massage, naturopathy, natural herbs, homeopathic treatment, special diets, high dose or megavitamin therapy, yoga/tai chi/qi gong, relaxation techniques, and prayer and spiritual healing. The Alternative Health Verbatim File (Part 10) contains the narrative text regarding the use of nontraditional health care practices.
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2024 (ICPSR 39445)

Released/updated on: 2025-10-30
Geographic coverage: United States
Time period: 2024-01-01--2024-12-31

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and substance use. There are more than 450 variables across the questionnaires. Substance use covered by this survey includes: amphetamines (stimulants), sedatives/barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, vaping, tobacco, smokeless tobacco, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2024:

  • The MTF sampling procedure was updated in 2024. Please see the 2024 MTF annual report for details. Variable-specific details are found in the user's guide that accompanies this study.
  • Changes were made to the question stems for many of the substance use "triplets", i.e. lifetime, 12-month, and 30-day timeframes, including: marijuana/cannabis, hallucinogens other than LSD, amphetamines (stimulants), sedatives/barbiturates, tranquilizers, and narcotics other than heroin. Additional information about question text and response option changes, along with details about added and dropped questions, are documented in the MTFQchanges2024byForm.pdf and MTFQchanges2024byType.pdf files available for download.
  • MTF is no longer providing dichotomized substance use variables on the DS1 datasets. As each researcher has their own method of working with data, it is up to the researcher to create these variables for their specific needs.
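Since dichotomized variables are now left to the researcher, one possible approach is sketched below. It assumes a frequency item where one code means "no use" and certain codes mark missing data; the specific code values (1 for no use, -9 and -8 for missing) are hypothetical defaults for illustration, so the codebook for each variable should be checked before applying anything like this.

```python
def dichotomize(code, no_use_code=1, missing_codes=(-9, -8)):
    """Collapse a frequency-of-use code into 0 (no use) or 1 (any use).

    Missing codes are returned as None so they are not counted as use.
    The default code values are assumptions, not MTF documentation.
    """
    if code in missing_codes:
        return None
    return 0 if code == no_use_code else 1


# Toy responses for one substance use item.
responses = [1, 3, -9, 7, 1]
any_use = [dichotomize(code) for code in responses]
print(any_use)
```

Keeping missing values distinct from "no use" avoids understating prevalence when the dichotomized variable is later tabulated.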
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2023 (ICPSR 39172)

Released/updated on: 2024-10-31
Geographic coverage: United States

This survey of 12th-grade students is part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and substance use. There are about 1,400 variables across the questionnaires. Substance use covered by this survey includes: tobacco, smokeless tobacco, alcohol, marijuana, hashish, vaping, prescription medications, over-the-counter medications, LSD, hallucinogens, amphetamines (stimulants), Ritalin (methylphenidate), sedatives/barbiturates (tranquilizers), cocaine, crack cocaine, GHB (gamma hydroxy butyrate), ecstasy, methamphetamine, and heroin. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2023:

  • 12th grade only: Continuation of randomized blocks of questions presented to students. Please see Appendix D of the codebook.
  • All grades: Change to the question stem for some lifetime, 12-month, and 30-day marijuana use questions. Please see the Highlights for 2023 section in the codebook for more details.
  • Separate codebooks are generated by ICPSR for the core data file (DS1) and the six form-specific data files (DS2-DS7). The codebooks contain only the frequencies, question text, and response options for the survey items. Please see the documentation under DS0 Study-Level Files for the annual study documentation provided by MTF, 39172-0001-User_guide-UsersGuide.pdf.
  • Additional information is documented in the MTFQchanges2023byForm.pdf and MTFQchanges2023byType.pdf files available for download.

Curated
Simple Crosstabs

Perception and Memory Experiments Using Drug Names [2010, Canada] (ICPSR 34122)

Released/updated on: 2013-04-30
Geographic coverage: Canada, Ontario, Global
Time period: 2012-03-28--2012-03-29, 2012-07-05--2012-07-06
Drug names that look and sound alike are a leading cause of medication errors (e.g., diazepam and diltiazem, hydroxyzine and hydralazine, Paxil and Taxol, fomepizole and omeprazole, Foradil and Toradol). Observational studies of dispensing in outpatient pharmacies suggest that the rate of wrong drug errors -- the type most likely to be the result of name confusion -- is roughly 0.13 percent. With 3.9 billion prescriptions dispensed in 2009, that translates to 5 million wrong drug errors per year in the United States. The purpose of this overall project was to develop, demonstrate, and disseminate a standard protocol for pre-approval testing of drug names, including a standard battery of psycholinguistic tests and data analytic methods, all with comparison to control names and to refine and demonstrate analytic methods by conducting a series of visual perception, auditory perception, and short term memory experiments using drug names as stimuli. The achievement of this aim will provide both regulators and pharmaceutical manufacturers with a scientifically validated, step-by-step method for testing new drug names for confusability. The data for this collection come from four experiments. In each experiment, participants are tested on their ability to correctly identify drug names under four conditions (see study design). Variables include participant reaction time to identify drug names and the percent participants correctly or incorrectly identified drug names. Study participants include medical doctors, nurse practitioners, pharmacists, and pharmacy technicians. Other variables include participant gender, education degree held, primary language spoken, and employment location.
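The error estimate quoted above follows from multiplying the number of prescriptions by the error rate. A minimal check of that arithmetic, using integer math to avoid floating-point noise (the variable names are chosen here for illustration):

```python
# Figures quoted in the study description.
prescriptions_2009 = 3_900_000_000

# 0.13 percent expressed as 13 errors per 10,000 prescriptions.
errors_per_10k = 13

# Estimated wrong-drug errors per year.
estimated_errors = prescriptions_2009 * errors_per_10k // 10_000

print(f"{estimated_errors:,}")
```

The product comes to about 5.07 million, which the description rounds to 5 million.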
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2024 (ICPSR 39444)

Released/updated on: 2025-10-30
Geographic coverage: United States

This survey of 12th-grade students is part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and substance use. There are about 1,400 variables across the six questionnaires. Substance use covered by these surveys includes tobacco, smokeless tobacco, alcohol, marijuana/cannabis, prescription medications, over-the-counter medications, inhalants, steroids, LSD and other hallucinogens, amphetamines (stimulants), sedatives/barbiturates, tranquilizers, cocaine, crack cocaine, ecstasy, methamphetamine, heroin, narcotics other than heroin, and vaping of nicotine, marijuana/cannabis, and flavors. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2024:

  • The MTF sampling procedure was updated in 2024. Please see the 2024 MTF annual report for details. Variable-specific details are found in the user's guide that accompanies this study.
  • Continuation of randomized blocks of questions presented to students. Please see Appendix D of the user's guide.
  • In 2023, the question about use of Delta-8 THC was included only on forms 3 and 6. In 2024, this question is now included on all survey forms. With the inclusion on all forms, please note these variable name changes:
    • CORE: V2934 was changed to V7976
    • Form 3: V3660 was changed to V7976
    • Form 6: V6676 was changed to V7976
  • Changes were made to the question stems for many of the substance use "triplets", i.e. lifetime, 12-month, and 30-day timeframes, including: marijuana/cannabis, hallucinogens other than LSD, amphetamines (stimulants), sedatives/barbiturates, tranquilizers, and narcotics other than heroin.
  • Additional information about question text and response option changes, along with details about added and dropped questions, are documented in the MTFQchanges2024byForm.pdf and MTFQchanges2024byType.pdf files available for download.
  • MTF is no longer providing dichotomized substance use variables on the DS1 datasets. As each researcher has their own method of working with data, it is up to the researcher to create these variables for their specific needs.
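When combining 2023 and 2024 files, the Delta-8 THC variable renames listed above can be applied with a small lookup table. The sketch below is one way to do that; the mapping values come from the list above, while the function name, the source labels used as keys, and the OTHER_VAR column are hypothetical.

```python
# 2023 variable names mapped to their 2024 names, keyed by source file.
RENAMES_2023_TO_2024 = {
    "CORE": {"V2934": "V7976"},
    "Form 3": {"V3660": "V7976"},
    "Form 6": {"V6676": "V7976"},
}


def harmonize(record, source):
    """Return a copy of a 2023 record with 2024 variable names applied."""
    mapping = RENAMES_2023_TO_2024.get(source, {})
    # Names without a mapping entry are kept unchanged.
    return {mapping.get(name, name): value for name, value in record.items()}


# Toy 2023 Form 3 record; OTHER_VAR stands in for any unaffected variable.
row_2023 = {"V3660": 2, "OTHER_VAR": 5}
row_harmonized = harmonize(row_2023, "Form 3")
print(row_harmonized)
```

Keying the mapping by source file matters here because the same 2024 name replaces a different 2023 name in each file.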
Curated

National Health Interview Survey, 1999 (ICPSR 3397)

Released/updated on: 2006-03-30
Geographic coverage: United States
The purpose of the National Health Interview Survey (NHIS) is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Implementation of a redesigned NHIS, consisting of a basic module, a periodic module, and a topical module, began in 1997 (see NATIONAL HEALTH INTERVIEW SURVEY, 1997 [ICPSR 2954]). The 1999 NHIS contains the household, family, person, sample adult, sample child, and immunization data files from the basic module. Included in the 1999 NHIS are periodic questions that provide additional detail on topics such as Adult Conditions (ACN), Adult Access and Utilization (AAU), Child Conditions, Limitation of Activity and Health Status (CHS), and Child Access and Utilization (CAU). Each record in the Household-Level File (Part 1) of the basic module contains data on the type of living quarters, number of families in the household responding and not responding, and the month and year of the interview for each sampling unit. The Family-Level File (Part 2) is made up of reconstructed variables from the person-level data of the basic module and includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, and employment status, along with industry and occupation. As part of the basic module, the Person-Level File (Part 3) provides information on all family members with respect to health status, limitation of daily activities, cognitive impairment, and health conditions. Also included are data on years at current residence, region variables, height, weight, bed days, doctor visits, hospital stays, and health care access and utilization. 
A randomly selected adult in each family was interviewed for the Sample Adult File (Part 4) regarding respiratory conditions, renal conditions, AIDS, joint symptoms, health status, limitation of daily activities, and behaviors such as smoking, alcohol consumption, and physical activity. The Sample Child File (Part 5) provides information from a knowledgeable adult in the household on medical conditions of one child in the household, such as respiratory problems, seizures, allergies, and use of special equipment such as hearing aids, braces, or wheelchairs. Also included are questions regarding child behavior, the use of mental health services, and Attention Deficit Hyperactivity Disorder (ADHD). The Child Immunization File (Part 6) presents information from shot records and supplies vaccination status, along with the number and dates of shots, and information about the chicken pox vaccine. Episode-based information is found in the Injury Episode File (Part 7), while information in the Injury Verbatim File (Part 8) is comprised of narrative text describing injuries, including type of injury, how the injury occurred, and the body part injured. The Poison Episode File (Part 9) examines the cause and date of injury or poisoning, loss of time from work or school, and whether the poisoning resulted in hospitalization.
Curated

National Health Interview Survey, 2001 (ICPSR 3605)

Released/updated on: 2005-11-04
Geographic coverage: United States

The purpose of the National Health Interview Survey (NHIS) is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Implementation of a redesigned NHIS, consisting of a basic module, a periodic module, and a topical module, began in 1997 (See NATIONAL HEALTH INTERVIEW SURVEY, 1997 [ICPSR 2954]).

The 2001 NHIS contains the Household, Family, Person, Sample Adult, Sample Child, Child Immunization, and Injury and Poison Episode data files from the basic module. Each record in the Household-Level File (Part 1) contains data on type of living quarters, number of families in the household responding and not responding, and the month and year of the interview for each sampling unit.

The Family-Level File (Part 2) is made up of reconstructed variables from the person-level data of the basic module and includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, and employment status, along with industry and occupation.

As part of the basic module, the Person-Level File (Part 3) provides information on all family members with respect to health status, limitation of daily activities, cognitive impairment, and health conditions. Also included are data on years at current residence, region variables, height, weight, bed days, doctor visits, hospital stays, and health care access and utilization.

A randomly selected adult in each family was interviewed for the Sample Adult File (Part 4) regarding respiratory conditions, renal conditions, AIDS, joint symptoms, health status, limitation of daily activities, and behaviors such as smoking, alcohol consumption, and physical activity. Also included in this file are variables pertaining to the Healthy People 2010 Objectives.

The Sample Child File (Part 5) provides information from an adult in the household on medical conditions of one child in the household, such as respiratory problems, seizures, allergies, and use of special equipment such as hearing aids, braces, or wheelchairs. Also included are variables regarding child behavior, the use of mental health services, and Attention Deficit Hyperactivity Disorder (ADHD).

The Child Immunization File (Part 6) presents information from shot records and supplies vaccination status, along with the number and dates of shots, and information about the chicken pox vaccine.

Episode-based information regarding injuries and poisonings is found in the Injury and Poison Episode File (Part 7), which examines the cause and date of injury or poisoning, loss of time from work or school, and whether the episode resulted in hospitalization.

Information in the Injury and Poison Verbatim File (Part 8) is comprised of narrative text describing injuries, including type of injury, how the injury occurred, and the body part injured.

Curated

National Health Interview Survey, 1998 (ICPSR 3107)

Released/updated on: 2006-01-12
Geographic coverage: United States
The purpose of the National Health Interview Survey (NHIS) is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Implementation of a redesigned NHIS (ICPSR 2954), consisting of a basic module, a periodic module, and a topical module, began in 1997. The present collection consists of the basic module and topical modules on prevention, which contain pregnancy and smoking components along with information on prevention of illness and injury for adults and children. Each record in the Household-Level File (Part 1) of the basic module contains data on the type of living quarters, number of families in the household responding and not responding, and the month and year of the interview for each eligible sampling unit. The Family-Level File (Part 2) is made up of reconstructed variables from the person-level data of the basic module and includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, and employment status, along with industry and occupation. As part of the basic module, the Person-Level File (Part 3) provides information on all family members with respect to health status, limitation of daily activities, cognitive impairment, and health conditions. Also included are data on years at current residence, region variables, height, weight, bed days, doctor visits, hospital stays, and health care access and utilization. A randomly selected adult in each family was interviewed for the Sample Adult File (Part 4) regarding respiratory conditions, renal conditions, AIDS, joint symptoms, health status, limitation of daily activities, and behaviors such as smoking, alcohol consumption, and physical activity.
The Sample Child File (Part 5) provides information from a knowledgeable adult in the household on medical conditions of one child in the household, such as respiratory problems, seizures, allergies, and use of special equipment such as hearing aids, braces, or wheelchairs. Also included are questions regarding child behavior and the use of mental health services. The Child Immunization File (Part 6) presents information from shot records and supplies vaccination status, along with the number and dates of shots, and information about the chicken pox vaccine. Episode-based information is found in the Injury Episode File (Part 7), while information in the Injury Verbatim File (Part 8) is comprised of narrative text describing injuries, including type of injury, how the injury occurred, and the body part injured. The Poison Episode File (Part 9) examines the cause and date of injury or poisoning, loss of time from work or school, and whether the poisoning resulted in hospitalization. The prevention modules are being examined to determine the "Healthy People Objectives for 2010," which have the aim of reducing or preventing illness and disease among Americans. The Pregnancy and Smoking Prevention Module (Part 10) contains a record for every woman 18-49 years of age and provides information on tobacco use and smoking during pregnancy. The Sample Adult Prevention Module (Part 11) examines injury prevention, environmental health issues, tobacco use, nutrition, workplace health promotion, heart disease, stroke, chronic diseases, clinical services used, preventive services used, cancer, oral health, physical activity, mental health, family discussions, and firearm safety. The Sample Child Prevention Module (Part 12) provides information on health conditions, dental care, and injury prevention, along with use of seat belts and safety equipment during participation in sports.
Curated
Restricted

Seek, Test, Treat and Retain Strategies Leveraging Mobile Health Technologies (Connect4Care), San Francisco, California, 2013-2015 (ICPSR 39783)

Released/updated on: 2026-04-20
Geographic coverage: San Francisco, United States, California
Time period: 2013-08-01--2015-11-01

This study is part of the Seek, Test, Treat and Retain (STTR) Collaboration Project that involved over twenty studies in the fields of HIV and drug abuse. All studies were independently developed, but were chosen for the collaboration because they focused on one or more steps of the HIV treatment cascade: Seek, Test, Treat and Retain. As part of the STTR Collaboration Project, the studies were grouped into Criminal Justice-related studies and Vulnerable Population-related studies. The data collected by these studies included twelve common domains (e.g., Demographic characteristics, Mental Health); within each domain, the studies took up a shared questionnaire or instrument and adapted it to fit the individual study.

Connect4Care (C4C) was a single-site, randomized, year-long study of Short Message Service (SMS) primary care appointment reminders vs. SMS primary care appointment reminders plus thrice-weekly supportive, informational, and motivational SMS messages. Eligible consenting patients were allocated 1:1 to the two arms within strata defined by HIV diagnosis within the past 12 months (i.e., "newly diagnosed") vs. earlier.
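
A 1:1 allocation within strata is often implemented with permuted blocks, although the abstract does not say which scheme C4C used. The following Python sketch illustrates the general idea under that assumption. The arm labels, stratum names, block size, and the `make_allocator` function are all hypothetical.

```python
import random

# Hypothetical labels for the two study arms.
ARMS = ("reminders_only", "reminders_plus_support")


def make_allocator(block_size=4, seed=None):
    """Return assign(stratum) -> arm, using permuted blocks within each stratum.

    ASSUMPTION: permuted-block randomization; the study's actual scheme
    is not described in the abstract.
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = {}  # stratum -> arms still unassigned in the current block

    def assign(stratum):
        block = pending.setdefault(stratum, [])
        if not block:
            # Start a new block with equal numbers of each arm, shuffled.
            block.extend(ARMS * (block_size // len(ARMS)))
            rng.shuffle(block)
        return block.pop()

    return assign
```

Calling `assign("newly_diagnosed")` or `assign("earlier")` for each enrolled patient keeps the two arms balanced within each stratum after every completed block.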

Curated
Restricted

National Survey of Alcohol, Drug, and Mental Health Problems [Healthcare for Communities], 1997-1998 (ICPSR 3025)

Released/updated on: 2006-03-30
Geographic coverage: United States
Time period: 1997-01-01--1998-01-01
This survey is a component of the Robert Wood Johnson Foundation's Health Tracking Initiative, a program designed to monitor changes within the health care system and their effects on people. Focusing on care and treatment for alcohol, drug, and mental health conditions, the survey reinterviewed respondents to the 1996-1997 CTS Household Survey (COMMUNITY TRACKING STUDY HOUSEHOLD SURVEY, 1996-1997, AND FOLLOWBACK SURVEY, 1997-1998: [UNITED STATES] [ICPSR 2524]). Topics covered by the questionnaire include (1) demographics, (2) health and daily activities, (3) mental health, (4) alcohol and illicit drug use, (5) use of medications, (6) health insurance coverage including coverage for mental health, (7) access, utilization, and quality of behavioral health care, (8) work, income, and wealth, and (9) life difficulties. Five imputed versions of the data are included in the collection for analysis with multiple imputation techniques.
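
Analyses of multiply imputed data are usually run once per imputed version and then pooled. The following Python sketch applies Rubin's rules to a set of per-imputation estimates. It is an illustration only: the function name `pool_rubin` is hypothetical, and the inputs are assumed to be one point estimate and one squared standard error from each of the imputed versions.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and squared standard errors.

    Returns (pooled_estimate, total_variance) following Rubin's rules.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)        # average of the point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The square root of the returned total variance gives the pooled standard error.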
Curated
Restricted

Risk Factors for AIDS Among Intravenous Drug Users Study, New York City, 1991-1995 [Restricted] (ICPSR 35078)

Released/updated on: 2015-02-24
Geographic coverage: New York City, United States
Time period: 1991-01-01--1995-01-01

The Risk Factors for AIDS among Intravenous Drug Users study is an ongoing series of cross-sectional studies that recruits participants from a storefront research site and from one of New York City's largest detoxification facilities. The goal of the study was to assess the potential effectiveness of HIV interventions by examining participants' drug use, risk behavior, and AIDS prevention knowledge and activities.

The dataset combines survey responses taken from interviews conducted at the Bellevue Methadone Maintenance Treatment Program, the Beth Israel Medical Center, and a high drug use area in the Lower East Side of Manhattan. All participants were at least 18 years of age. Participants from the Beth Israel Medical Center and the Lower East Side were given face-to-face interviews based on a World Health Organization Multi-Centre questionnaire. Data from the Bellevue Methadone Maintenance Treatment Program were extracted from patients' clinical files. Only minimal demographic and HIV risk behavior data were included in the methadone patient responses in these data to protect their anonymity. Blood samples were taken from participants to test for HIV.

These data also contain information on topics including participant demographics, alcohol use, drug use, substance abuse treatment, needle sharing habits, sexual behavior, social networks, HIV testing services, as well as mental and physical health. Drug use explored in this study includes heroin, cocaine, crack, methadone, amphetamines, ice, tranquilizers, barbiturates, and other drugs.

There are 2,907 respondents and 906 variables in the dataset.

Curated

National Health Interview Survey, 2000 (ICPSR 3381)

Released/updated on: 2006-03-30
Geographic coverage: United States
The purpose of the National Health Interview Survey (NHIS) is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. Implementation of a redesigned NHIS, consisting of a basic module, a periodic module, and a topical module, began in 1997 (See NATIONAL HEALTH INTERVIEW SURVEY, 1997 [ICPSR 2954]). This final release of the 2000 NHIS contains the Household, Family, Person, Sample Adult, Sample Child, Immunization, and Injury and Poison data files from the basic module. The 2000 NHIS also contains the Cancer Control Module (included in the Sample Adult File, Part 4), which corresponds to the Cancer Supplements of 1987 and 1992 and examines such items as diet and nutrition, use of herbal supplements, Hispanic acculturation, genetic testing, and family history. Each record in the Household-Level File (Part 1) of the basic module contains data on the type of living quarters, number of families in the household responding and not responding, and the month and year of the interview for each eligible sampling unit. The Family-Level File (Part 2) is made up of reconstructed variables from the person-level data of the basic module and includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, and employment status, along with industry and occupation. As part of the basic module, the Person-Level File (Part 3) provides information on all family members with respect to health status, limitation of daily activities, cognitive impairment, and health conditions. Also included are data on years at current residence, region variables, height, weight, bed days, doctor visits, hospital stays, and health care access and utilization.
A randomly selected adult in each family was interviewed for the Sample Adult File (Part 4) regarding respiratory conditions, renal conditions, AIDS, joint symptoms, health status, limitation of daily activities, and behaviors such as smoking, alcohol consumption, and physical activity. The Sample Child File (Part 5) provides information from a knowledgeable adult in the household on medical conditions of one child in the household, such as respiratory problems, seizures, allergies, and use of special equipment such as hearing aids, braces, or wheelchairs. Also included are questions regarding child behavior, the use of mental health services, and Attention Deficit Hyperactivity Disorder (ADHD). The Child Immunization File (Part 6) presents information from shot records and supplies vaccination status, along with the number and dates of shots, and information about the chicken pox vaccine. The Injury and Poison Data File (Part 7) contains episode-level data for injuries and poisonings and the Injury and Poison Verbatim File (Part 8) contains verbatim comments for both injuries and poisonings.
Curated
Restricted

Research on Pathways to Desistance [Maricopa County, AZ and Philadelphia County, PA]: Official Arrest Records, 2000-2010 [Restricted] (ICPSR 34605)

Released/updated on: 2014-07-24
Geographic coverage: United States, Phoenix, Arizona, Philadelphia, Pennsylvania
Time period: 2000-01-01--2010-01-01

The Pathways to Desistance study was a multi-site study that followed 1,354 serious juvenile offenders from adolescence to young adulthood in two locales between the years 2000 and 2010. Enrolled into the study were adjudicated youths from the juvenile and adult court systems in Maricopa County (Phoenix), Arizona (N=654), and Philadelphia County, Pennsylvania (N=700).

The official arrest records of all 1,354 youth were obtained from multiple sources. For arrests/petitions under the age of 18, this information is based on petitions appearing in the juvenile and adult court records in each site. In Philadelphia, this information was gathered based on a hand review of juvenile and adult court documents; in Phoenix, the information is based on reports from two computerized court tracking systems (JOLTS--Juvenile On-Line Tracking System for juvenile court information, ICIS--Maricopa County Superior Court database for adult court information). For arrests/petitions over 18, FBI arrest records are the source of information. There is no self-reported information contained in this set of data.

Information from these different data sources is consolidated into the following categories:

  1. Information regarding petitions with a date that falls prior to the baseline interview date ("prior petitions").
  2. Information regarding the study index petition (also called the "initial referring petition"; this is the adjudication that prompted study enrollment). Information regarding the study index petition can be found by accessing the "type" variable associated with the prior petitions (specific variable name: Official Record Prior PetitionXX: Petition type). Depending on the investigator's needs, this petition can remain combined with the "priors" or be used as a stand-alone petition.
  3. Information regarding arrests and court petitions with a date which falls after the baseline interview date in the Pathways study ("rearrests").
Curated
Simple Crosstabs

Risk Factors for AIDS Among Intravenous Drug Users Study, New York City, 1991-1995 (ICPSR 36215)

Released/updated on: 2015-06-30
Geographic coverage: New York City, United States
Time period: 1991-01-01--1995-01-01

The Risk Factors for AIDS among Intravenous Drug Users study is an ongoing series of cross-sectional studies that recruits participants from a storefront research site and from one of New York City's largest detoxification facilities. The goal of the study was to assess the potential effectiveness of HIV interventions by examining participants' drug use, risk behavior, and AIDS prevention knowledge and activities.

The dataset combines survey responses taken from interviews conducted at the Bellevue Methadone Maintenance Treatment Program, the Beth Israel Medical Center, and a high drug use area in the Lower East Side of Manhattan. All participants were at least 18 years of age. Participants from the Beth Israel Medical Center and the Lower East Side were given face-to-face interviews based on a World Health Organization Multi-Centre questionnaire. Data from the Bellevue Methadone Maintenance Treatment Program were extracted from patients' clinical files. Only minimal demographic and HIV risk behavior data were included in the methadone patient responses in these data to protect their anonymity. Blood samples were taken from participants to test for HIV.

These data also contain information on topics including participant demographics, alcohol use, drug use, substance abuse treatment, needle sharing habits, sexual behavior, social networks, HIV testing services, as well as mental and physical health. Drug use explored in this study includes heroin, cocaine, crack, methadone, amphetamines, ice, tranquilizers, barbiturates, and other drugs.

This dataset is public-use. A restricted-use version of the dataset is also available with the associated study number 35078. There are 2,907 respondents and 902 variables in the dataset.

Curated
Restricted

Seek, Test, Treat Strategies for Vietnamese Drug Users: A Random Controlled Trial (Project VISTA), Vietnam, 2013 (ICPSR 39802)

Released/updated on: 2026-04-23
Geographic coverage: Asia, Vietnam (Socialist Republic)

This study is part of the Seek, Test, Treat and Retain (STTR) Collaboration Project that involved over twenty studies in the fields of HIV and drug abuse. All studies were independently developed, but were chosen for the collaboration because they focused on one or more steps of the HIV treatment cascade: Seek, Test, Treat and Retain. As part of the STTR Collaboration Project, the studies were grouped into Criminal Justice-related studies and Vulnerable Population-related studies. The data collected by these studies included twelve common domains (e.g., Demographic characteristics, Mental Health); within each domain, the studies took up a shared questionnaire or instrument and adapted it to fit the individual study.

Project VISTA offers drug users in Vietnam HIV testing and counseling (HTC). The primary outcome measure for this study was suppression of HIV-1 viral load below 400 copies/mL (measured at the 6 and 12 month visits). Secondary outcome measures included self-reported antiretroviral adherence, changes in CD4 cell counts, and quality of life.

A secondary seek/test study will be done to evaluate uptake of HIV testing and counseling by network referrals from the RCT participants (the data files will not be part of the STTR integrated data).

Curated
Simple Crosstabs

Strengthening Washington DC Families (SWFP) Project, 1998 - 2004 (ICPSR 34425)

Released/updated on: 2012-12-10
Geographic coverage: District of Columbia, United States, Maryland
Time period: 1998-11-01--2004-04-01

The Strengthening Washington DC Families (SWFP) Project examined the effectiveness of an evidence-based prevention program implemented on a sample of 715 families across multiple settings in an urban area. The study area also included suburban Maryland. SWFP was set up as a true experimental design with families being randomly placed into one of four treatment conditions:

  • child skills training only
  • parent skills training only
  • parent and child skills training plus family skills training
  • minimal treatment controls

Entire families were assigned to one of the four treatment conditions. Data were collected from all family members who participated in the program. Thus the individual data files contain more than 715 records. The parent file contains 796 cases and the child file contains 961 cases.

The Strengthening Families Program is based on cognitive-behavioral social learning theory and family systems theory targeting elementary school-aged children. In this program parents receive training in parenting skills, children receive training primarily in social skills, and families receive family skills training. The aim of the program is to effectively reduce parent, child, and family risk factors for substance use and delinquency.

Curated
Restricted

Exploring the Drugs-Crime Connection Within the Electronic Dance Music and Hip Hop Nightclub Scenes in Philadelphia, Pennsylvania, 2005-2006 (ICPSR 21187)

Released/updated on: 2013-01-15
Geographic coverage: United States, Philadelphia, Pennsylvania
Time period: 2005-04-01--2006-12-01
To explore the relationship between alcohol, drugs, and crime in the electronic dance music and hip hop nightclub scenes of Philadelphia, Pennsylvania, researchers utilized a multi-faceted ethnographic approach featuring in-depth interviews with 51 respondents (Dataset 1, Initial Interview Qualitative Data) and two Web-based follow-up surveys with respondents (Dataset 2, Follow-Up Surveys Quantitative Data). Recruitment of respondents began in April of 2005 and was conducted in two ways. Slightly more than half of the respondents (n = 30) were recruited with the help of staff from two small, independent record stores. The remaining 21 respondents were recruited at electronic dance music or hip hop nightclub events. Dataset 1 includes structured and open-ended questions about the respondent's background, living situation and lifestyle, involvement and commitment to the electronic dance music and hip hop scenes, nightclub culture and interaction therein, and experiences with drugs, criminal activity, and victimization. Dataset 2 includes descriptive information on how many club events were attended, which ones, and the activities (including drug use and crime/victimization experiences) taking place therein. Dataset 3 (Demographic Quantitative Data) includes coded demographic information from the Dataset 1 interviews.
Curated
Restricted
Simple Crosstabs

Parents And Children Coping Together (PACT I Child), Los Angeles, California, 1997-2002 (ICPSR 35194)

Released/updated on: 2018-04-23
Geographic coverage: Los Angeles, California
Time period: 1997-01-01--2002-01-01

Parents And Children Coping Together (PACT) was designed to longitudinally assess mothers living with HIV (MLHs) in Los Angeles County and their young, well children aged 5 to 11 years. The PACT sample was followed every 6 months for 30 months. The study utilizes longitudinal data from child/adolescent and mother dyads to investigate the effects of maternal HIV and family variables on adolescent sexual behavior. Specific aims were to:

  1. Evaluate longitudinally youth adjustment (i.e., mental health, behavioral adjustment, social outcomes) including measures for young children. Measures included developmentally appropriate youth and maternal mental health measures (e.g., Children's Depression Inventory for youths under age 18; Beck Depression Inventory for youths aged 18 or older), assessment of maternal physical health, assessment of child behaviors, and family functioning.
  2. Evaluate youth characteristics from across developmental periods that may moderate or mediate the impact of MLHs' chronic illness on patterns of youth adjustment over time, including: (a) background factors of age, gender, ethnicity; and (b) moderating and mediating factors, such as self-concept, family cohesion, the parent-child relationship, HIV/AIDS knowledge, perceived stigma, autonomy, and parent-adolescent separation.
  3. Evaluate maternal characteristics that may moderate or mediate the impact of MLHs' chronic illness on the youth (e.g., illness severity, mental health status, social support, parenting skills).
This collection is a part of the PACT series and contains the child datasets (baselines and follow-ups) from PACT 1. The instrument was divided into 16 sections. These sections are as follows: (A) PACT Child Demographic Questions, (B) Child Social Network, (C) Modified Items from My Family and Friends, (E) RCMAS, (F) Timeline, (G) CDI, (H) Dominic-R, (I) Hopelessness Scale for Children, (L) Children's Coping Strategies Checklist, (M) General Coping Efficacy (C-GCE), (N) Family Functioning (Four subscales), (P) Children's Self-Concept, (Q) Children's Health Knowledge and Attitude Items, (R) Items from Child Behavior Checklist, (S) Life Events and Family Routines, and (T) GLESC Positive Stable Events. Demographic variables include age, education, religious activities, and household structure.
Curated
Restricted

Epidemiologic Catchment Area Program Sites 1-4, 1979-1983 with National Death Index Data through 2007 (ICPSR 36621)

Released/updated on: 2017-10-17
Geographic coverage: North Carolina, Baltimore, New Haven, United States, Connecticut, Missouri, St. Louis, Durham, Maryland
Time period: 1979-01-01--1982-01-01, 1980-01-01--1983-01-01, 1979-01-01--2007-01-01

The Epidemiologic Catchment Area (ECA) program of research was initiated in response to the 1977 report of the President's Commission on Mental Health. The purpose was to collect data on the prevalence and incidence of mental disorders and on the use of and need for services by the mentally ill. Independent research teams at five universities (Yale University, Johns Hopkins University, Washington University, Duke University, and University of California at Los Angeles), in collaboration with the National Institute of Mental Health, conducted the studies with a core of common questions and sample characteristics. The sites were areas that had previously been designated as Community Mental Health Center catchment areas: New Haven, Connecticut; Baltimore, Maryland; St. Louis, Missouri; Durham, North Carolina; and Los Angeles, California. Each site sampled over 3,000 community residents and 500 residents of institutions, yielding 20,861 respondents overall. The longitudinal ECA design incorporated two waves of personal interviews administered one year apart and a brief telephone interview in between (for the household sample). The diagnostic interview used in the ECA was the NIMH Diagnostic Interview Schedule (DIS), Version III (with the exception of the Yale Wave I survey, which used Version II). Diagnoses were categorized according to the DIAGNOSTIC AND STATISTICAL MANUAL OF MENTAL DISORDERS, 3rd Edition (DSM-III). Diagnoses derived from the DIS include manic episode, dysthymia, bipolar disorder, single episode major depression, recurrent major depression, atypical bipolar disorder, alcohol abuse or dependence, drug abuse or dependence, schizophrenia, schizophreniform, obsessive compulsive disorder, phobia, somatization, panic, antisocial personality, and anorexia nervosa. The DIS uses the Mini-Mental State Examination (MMSE), which measures cognitive functioning, as an indirect measure of the DSM-III Organic Mental Disorders. In the ECA survey, this diagnosis is called cognitive impairment.

This collection features data from 17,327 participants across 2,005 variables. Data from the Los Angeles, California, Catchment (UCLA) are not included. Baseline data (Wave 1) and Wave 2 data were linked to the National Death Index through 2007, which includes primary and contributing causes of death, International Classification of Disease (ICD) codes, and nature of injury variables.

Curated
Restricted

Pittsburgh Youth Study Middle Sample (1987 - 1991) [Pittsburgh, Pennsylvania] (ICPSR 36454)

Released/updated on: 2017-01-06
Geographic coverage: United States, Pennsylvania, Pittsburgh
Time period: 1987-01-01--1991-01-01

The Pittsburgh Youth Study (PYS) is part of the larger "Program of Research on the Causes and Correlates of Delinquency" initiated by the Office of Juvenile Justice and Delinquency Prevention in 1986. PYS aims to document the development of antisocial and delinquent behavior from childhood to early adulthood, the risk factors that impinge on that development, and help seeking and service provision for boys' behavior problems. The study also focuses on boys' development of alcohol and drug use and internalizing problems.

PYS consists of three samples of boys who were in the first, fourth, and seventh grades in Pittsburgh, Pennsylvania public schools during the 1987-1988 academic year (called the youngest, middle, and oldest sample, respectively). Using a screening risk score that measured each boy's antisocial behavior, boys identified at the top 30 percent within each grade sample on the screening risk measure (n=~250), as well as an equal number of boys randomly selected from the remainder (n=~250), were selected for follow-up. Consequently, the final sample for the study consisted of 1,517 total students selected for follow-up. 506 of these students were in the oldest sample, 508 were in the middle sample, and 503 were in the youngest sample.

Assessments were conducted semiannually and then annually using multiple informants (i.e., boys, parents, teachers) between 1987 and 2010. The youngest sample was assessed from ages 6-19 and again at ages 25 and 28. The middle sample was assessed from ages 9-13 and again at age 23. The oldest sample was assessed from ages 13-25, with an additional assessment at age 35. Information has been collected on a broad range of risk and protective factors across multiple domains (e.g., individual, family, peer, school, neighborhood). Measures of conduct problems, substance use/abuse, criminal behavior, and mental health problems have been collected.

This study collection contains only the middle sample respondents.

Curated

Older Drug Users: A Life Course Study of Turning Points in Drug Use [in a large Southeastern Metropolitan Area], 2009-2010 (ICPSR 34296)

Released/updated on: 2012-07-31
Geographic coverage: United States
Time period: 2009-01-01--2010-01-01

The Older Drug Users study was a mixed method, retrospective longitudinal study that interviewed 92 respondents in a large southeastern metropolitan area from January 2009 to August 2010. The goal of the study was to provide in-depth life history on the drug use trajectories of older drug users, specific turning points in drug use patterns, and drug-related health risks over a person's life course.

Quantitative and qualitative data were collected from each respondent. Two questionnaires were used to collect the quantitative data. The first questionnaire asked about the person's basic demographic information (gender, race, age, and education), health history (whether the person had been diagnosed with HIV, AIDS, or Hepatitis C), and drug use (route and frequency) and treatment in the past 30 days across ten different substances (tobacco, alcohol, marijuana, hallucinogens/LSD/Ecstasy/club drugs, prescription pills, cocaine, crack, heroin, amphetamines, and methamphetamine).

A second questionnaire served as a retrospective life history of the person. It asked about use of and treatment for the same ten substances, but this time over an entire year. Questions were also asked concerning the person's living arrangement, employment, family roles, drug roles, and sexual activity over the course of the year. The questions were repeated for every year of the person's life from birth up to the time the person was interviewed.

Curated
Restricted

Pittsburgh Youth Study Youngest Sample (1987 - 2001) [Pittsburgh, Pennsylvania] (ICPSR 36453)

Released/updated on: 2017-01-06
Geographic coverage: United States, Pennsylvania, Pittsburgh
Time period: 1987-01-01--1991-01-01, 1991-01-01--2001-01-01, 2006-01-01--2007-01-01, 2009-01-01--2010-01-01

The Pittsburgh Youth Study (PYS) is part of the larger "Program of Research on the Causes and Correlates of Delinquency" initiated by the Office of Juvenile Justice and Delinquency Prevention in 1986. PYS aims to document the development of antisocial and delinquent behavior from childhood to early adulthood, the risk factors that impinge on that development, and help seeking and service provision for boys' behavior problems. The study also focuses on boys' development of alcohol and drug use and internalizing problems.

PYS consists of three samples of boys who were in the first, fourth, and seventh grades in Pittsburgh, Pennsylvania public schools during the 1987-1988 academic year (called the youngest, middle, and oldest sample, respectively). Using a screening risk score that measured each boy's antisocial behavior, boys identified at the top 30 percent within each grade sample on the screening risk measure (n=~250), as well as an equal number of boys randomly selected from the remainder (n=~250), were selected for follow-up. Consequently, the final sample for the study consisted of 1,517 total students selected for follow-up. 506 of these students were in the oldest sample, 508 were in the middle sample, and 503 were in the youngest sample.

Assessments were conducted semiannually and then annually using multiple informants (i.e., boys, parents, teachers) between 1987 and 2010. The youngest sample was assessed from ages 6-19 and again at ages 25 and 28. The middle sample was assessed from ages 9-13 and again at age 23. The oldest sample was assessed from ages 13-25, with an additional assessment at age 35. Information has been collected on a broad range of risk and protective factors across multiple domains (e.g., individual, family, peer, school, neighborhood). Measures of conduct problems, substance use/abuse, criminal behavior, and mental health problems have been collected.

This study collection contains only the youngest sample respondents.

Curated
Restricted

The Dynamic Context of Teen Dating Violence in Adolescent Relationships, Baltimore, Maryland, 2014-2016 (ICPSR 36869)

Released/updated on: 2018-05-23
Geographic coverage: Baltimore, United States, Maryland
Time period: 2014-01-01--2016-01-01

These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.

Adolescent females residing in Baltimore, Maryland, who were involved in a relationship with a history of violence were recruited to participate in this research study. Respondents were interviewed and then followed through daily diary entries for several months. The aim of the research was to understand the context of teen dating violence (TDV). Prior research on relationship context has not focused on minority populations; therefore, the focus of this project was urban, predominantly African American females.

The available data in this collection include three SAS (.sas7bdat) files and a single SAS formats file that contains variable and value label information for all three data files. The three data files are:

  • final_baseline.sas7bdat (157 cases / 252 variables)
  • final_partnergrid.sas7bdat (156 cases / 76 variables)
  • hart_final_sas7bdata (7004 cases / 23 variables)
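As a rough sketch of how a user might confirm that downloaded files match the case and variable counts listed above, the following assumes pandas is installed and that the files sit in a local directory of the user's choosing. The helper names (`shape_matches`, `load_and_check`) and the directory argument are hypothetical, not part of the collection. The sketch does not apply the value labels from the SAS formats file.

```python
# Case and variable counts copied from the file list above.
# The third file name is reproduced exactly as it appears in the listing;
# the name on disk may differ.
EXPECTED_SHAPES = {
    "final_baseline.sas7bdat": (157, 252),
    "final_partnergrid.sas7bdat": (156, 76),
    "hart_final_sas7bdata": (7004, 23),
}


def shape_matches(file_name, shape):
    """Return True when (rows, columns) equals the counts in the listing."""
    return EXPECTED_SHAPES.get(file_name) == tuple(shape)


def load_and_check(directory):
    """Read each file with pandas and report whether its shape matches.

    Assumes pandas is installed and the files sit in `directory`
    (a hypothetical local path).
    """
    from pathlib import Path

    import pandas as pd  # imported lazily so shape_matches works without pandas

    report = {}
    for file_name in EXPECTED_SHAPES:
        # format is given explicitly so it is not inferred from the extension
        frame = pd.read_sas(Path(directory) / file_name, format="sas7bdat")
        report[file_name] = shape_matches(file_name, frame.shape)
    return report
```

Calling `load_and_check` on the unzipped download directory would return a dictionary mapping each file name to `True` or `False`, which makes a truncated or mismatched file easy to spot before analysis.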
Curated
Simple Crosstabs

Criminal Justice Drug Abuse Treatment Studies (CJ-DATS): Performance Indicators for Corrections (PIC), 2002-2006 [United States] (ICPSR 27942)

Released/updated on: 2013-05-08
Geographic coverage: United States
Time period: 2002-01-01--2006-01-01

In 2002, the National Institute on Drug Abuse (NIDA) funded the Criminal Justice Drug Abuse Treatment Studies (CJ-DATS) cooperative agreement. The Institute of Behavioral Research at Texas Christian University (TCU) was one of nine National Research Centers selected to study current drug treatment practices and outcomes in correctional settings and to examine strategies for improving treatment services for drug-involved offenders.

The specific aims of the PIC study were to:

  1. Cross-sectionally test and adapt the TCU CJ-CEST, BOP, and NDRI CAI assessments for use in multiple correctional settings;
  2. Examine agency and program records of client progress relevant to treatment process; and
  3. Revise the assessments as necessary for use in longitudinal assessment protocols and CJ Management Information Systems (MIS).

During the first data collection period, Wave 1, a total of 3,266 inmates were surveyed from research centers based out of Texas Christian University, the University of Delaware, the University of Kentucky, University of California, Los Angeles (UCLA), and the National Development and Research Institute (NDRI). After psychometrics were run and the forms revised slightly, a second administration took place but this time only at two centers (TCU and Delaware). During Wave 2 a total of 1,421 clients participated in the survey.

Curated
Restricted

Northwestern Juvenile Project (Cook County, Illinois): Follow-up 2, 1999 - 2005 (ICPSR 36629)

Released/updated on: 2018-06-08
Geographic coverage: United States, Chicago, Illinois
Time period: 1977-01-01--2005-01-01

This study contains data from the second follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. This second follow-up occurred approximately 3.5 years after the baseline interview and focused on the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risky behaviors.

The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Researchers studied changes in disorders over time (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. The NJP addressed the patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors are interrelated.

The original sample included 1,829 randomly selected youth (1,172 males and 657 females), then 10 to 18 years old, who were enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1,005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. A random subsample of 997 of the baseline participants was chosen for second follow-up interviews. Researchers tracked participants from the time they left detention and re-interviewed them regardless of where they were living when their follow-up interview was due: in the community, in correctional settings, or, if they lived farther than two hours from Chicago, by telephone.

The study was funded by OJJDP, several institutes at the National Institutes of Health, and other federal agencies and private foundations. The National Institutes of Health funded an additional component on HIV/AIDS risk behaviors.

Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2021 (ICPSR 38503)

Released/updated on: 2022-10-31
Geographic coverage: United States

This survey of 12th-grade students is part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and drug use. There are about 1,400 variables across the questionnaires. Drugs covered by this survey include tobacco, smokeless tobacco, alcohol, marijuana, hashish, prescription medications, over-the-counter medications, LSD, hallucinogens, amphetamines (stimulants), Ritalin (methylphenidate), Quaaludes (methaqualone), barbiturates (tranquilizers), cocaine, crack cocaine, GHB (gamma hydroxy butyrate), ecstasy, methamphetamine, and heroin. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2021:

  • Data collection resumed in 2021, with a change to all web-based surveys.
  • Students completed the surveys on their personal or school-provided device.
  • Non-survey variables have been changed or added to facilitate analyses. For details, please see the codebook section "MTF Variable Information - Non-survey variables included in the data files - Survey mode and design variables for 2021"
  • Information about "screen break" issues, where series of questions were originally presented differently in the web-based survey as compared to the 2019/2020 tablet surveys. Please see the codebook and Appendix D for details.
  • For 12th grade: two additional changes to the survey presentation. Please see the codebook section "MTF Variable Information - Non-survey variables included in the data files", and respective appendices for details.
  • Introduction of randomized blocks of questions presented to students. Please see Appendix E.
  • Test of presentation of items in the substance use consequences section on form 3. Please see Appendix F.
  • Additional information is documented in the MTFQchanges2021byForm.pdf and MTFQchanges2021byType.pdf files available for download.
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2022 (ICPSR 38882)

Released/updated on: 2023-10-31
Geographic coverage: United States

This survey of 12th-grade students is part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and drug use. There are about 1,400 variables across the questionnaires. Drugs covered by this survey include tobacco, smokeless tobacco, alcohol, marijuana, hashish, prescription medications, over-the-counter medications, LSD, hallucinogens, amphetamines (stimulants), Ritalin (methylphenidate), Quaaludes (methaqualone), barbiturates (tranquilizers), cocaine, crack cocaine, GHB (gamma hydroxy butyrate), ecstasy, methamphetamine, and heroin. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2022:

  • Continuation of randomized blocks of questions presented to students. Please see Appendix D of the codebook.
  • Change to the question stem for some lifetime, 12 month, and 30 day heroin and marijuana use questions. Please see the Highlights for 2022 section in the codebook for more details.
  • Change to the heroin use questions: Separate questions about heroin use with a needle and heroin use without a needle for lifetime, past 12 months, and past 30 day timeframes are no longer asked. The separate questions have been replaced by the single question, "On how many occasions (if any), have you taken heroin...
    • ...in your lifetime?
    • ...during the last 12 months?
    • ...during the last 30 days?
Please see the Highlights for 2022 section in the codebook for more details.

Additional information is documented in the MTFQchanges2022byForm.pdf and MTFQchanges2022byType.pdf files available for download.

Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2022 (ICPSR 38883)

Released/updated on: 2023-10-31
Geographic coverage: United States
Time period: 2022-01-01--2022-12-31

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and drug use. There are more than 450 variables across the questionnaires. Drugs covered by this survey include amphetamines (stimulants), barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, tobacco, smokeless tobacco, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2022:

  • Change to the heroin use questions: Separate questions about heroin use with a needle and heroin use without a needle for lifetime, past 12 months, and past 30 day timeframes are no longer asked. The separate questions have been replaced by the single question, "On how many occasions (if any), have you taken heroin...
    • ...in your lifetime?
    • ...during the last 12 months?
    • ...during the last 30 days?
Please see the Highlights for 2022 section in the codebook for more details.
  • Change to the question stem for some lifetime, 12 month, and 30 day marijuana use questions: Please see the Highlights for 2022 section in the codebook for more details.
  • Additional information is documented in the MTFQchanges2022byForm.pdf and MTFQchanges2022byType.pdf files available for download.
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2020 (ICPSR 38189)

Released/updated on: 2021-10-26
Geographic coverage: United States
Time period: 2020-01-01--2020-12-31

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and drug use. There are more than 500 variables across the questionnaires. Drugs covered by this survey include amphetamines (stimulants), barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, tobacco, smokeless tobacco, vaping, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2020:

  • All students recorded their survey answers on tablets that the project brought to the schools, preloaded with the MTF surveys.
  • Data collection was halted prematurely on March 15, 2020, when the University of Michigan stopped all projects that involved face-to-face data collection because of COVID-19 concerns. This resulted in a 2020 sample about 25% of the size of a regular data collection.
  • Guidance for combining grades for analysis: see Appendix C of the codebook.
  • Information about potential mode effects for questions on student attitudes and beliefs when comparing previous years' paper-based survey responses to the current tablet method of collection. Please see the codebook Introduction - Survey Mode section for details.
Curated
Simple Crosstabs

Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Public-Use Files (ICPSR 37786)

Released/updated on: 2025-06-27
Geographic coverage: United States
Time period: 2017-01-01--2018-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Public-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Wave 4.5 was a special data collection for youth only who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.

Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older were interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).

Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.

Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,395 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.

Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,323 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,625 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, and 5.

The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.

Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 908 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.

Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.

The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.

Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, and PATH-ATS, indicating if participants had ever/never used various tobacco products as of the Wave 5.5/PATH-ATS data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5/PATH-ATS data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 2,760 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 1,889 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.

There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.
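The case counts quoted above for the Wave 4.5, 5.5, and 7.5 questionnaire files, and for the Wave 1 Cohort, can be cross-checked with simple arithmetic. The sketch below only restates figures from this description; the variable and function names are illustrative and do not come from the PATH Study documentation.

```python
# Counts copied from the dataset descriptions above:
# (total cases, continuing respondents, aged-up respondents).
CASE_BREAKDOWNS = {
    "DS1002": (13_131, 11_378, 1_753),
    "DS2001": (3_628, 1_014, 2_614),
    "DS2002": (7_129, 7_076, 53),
    "DS4001": (7_961, 5_952, 2_009),
    "DS4002": (8_949, 7_064, 1_885),
}

# Wave 1 Cohort figures from the description above.
WAVE1_INTERVIEWED = 45_971   # adults and youth in the baseline wave
WAVE1_SHADOW_YOUTH = 7_207   # youth ages 9 to 11 sampled at Wave 1
WAVE1_COHORT = 53_178        # stated size of the Wave 1 Cohort


def inconsistent_datasets(breakdowns):
    """Return the dataset labels whose parts do not add up to the stated total."""
    return [
        label
        for label, (total, continuing, aged_up) in breakdowns.items()
        if continuing + aged_up != total
    ]


if __name__ == "__main__":
    print(inconsistent_datasets(CASE_BREAKDOWNS))
    print(WAVE1_INTERVIEWED + WAVE1_SHADOW_YOUTH == WAVE1_COHORT)
```

With the figures as quoted, the list of inconsistent datasets is empty and the Wave 1 Cohort total agrees with its two components. The same check can be rerun against the case counts of the downloaded files.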

Curated
Restricted

Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Restricted-Use Files (ICPSR 37519)

Released/updated on: 2026-04-13
Geographic coverage: United States
Time period: 2017-01-01--2018-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide for further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Wave 4.5 was a special data collection only for youth who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.

Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older was interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).

Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.

Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,617 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.

Dataset 1402 (DS1402) contains the Wave 4.5 State Identifier data for Youth and Parents and has 5 variables and 13,131 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of Wave 4.5.
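
As a rough illustration of how PERSONID links a State Identifier file to questionnaire data, the following Python sketch joins two small in-memory tables. Only PERSONID comes from the description above; the other column names and all of the rows are hypothetical placeholders, not values from the actual files.

```python
# Hypothetical rows: only the PERSONID column name is taken from the
# dataset description; every other name and value is made up.
questionnaire = [
    {"PERSONID": "P0001", "ever_used": 1},
    {"PERSONID": "P0002", "ever_used": 0},
]
state_ids = [
    {"PERSONID": "P0001", "state_fips": "26", "state_abbr": "MI", "state_name": "Michigan"},
    {"PERSONID": "P0002", "state_fips": "39", "state_abbr": "OH", "state_name": "Ohio"},
]


def link_state(questionnaire_rows, state_rows, key="PERSONID"):
    """Left-join the state fields onto questionnaire rows by the linking key."""
    by_id = {row[key]: row for row in state_rows}
    linked = []
    for row in questionnaire_rows:
        merged = dict(row)
        # Rows without a matching state record are kept unchanged.
        merged.update(by_id.get(row[key], {}))
        linked.append(merged)
    return linked


linked = link_state(questionnaire, state_ids)
```

A left join is used here so that no questionnaire case is dropped if its state record is missing; whether that is the right choice depends on the analysis.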

Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,619 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,871 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5 and 5.

The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.

Dataset 2401 (DS2401) contains the Wave 5.5 State Identifier data for Adults and has 5 variables and 3,628 cases. Dataset 2402 (DS2402) contains the Wave 5.5 State Identifier data for Youth and Parents and has 5 variables and 7,129 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.5.

Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, and Wave 5.5 indicating if participants had ever/never used various tobacco products as of the Wave 5.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 977 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.

Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.

The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.

Dataset 3401 (DS3401) contains the PATH-ATS State Identifier data and has 5 variables and 8,874 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of PATH-ATS.

Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 3,142 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 2,169 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."
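
The case counts quoted in this entry can be cross-checked with simple arithmetic. The Python sketch below is only a reader-side sanity check, not part of the study documentation: it confirms that the continuing and aged-up counts reported for each questionnaire file add up to the stated number of cases, and that the Wave 1 Cohort total matches its two components.

```python
# (continuing, aged-up, stated total) as quoted in the dataset descriptions.
files = {
    "DS1002": (11_378, 1_753, 13_131),  # Wave 4.5 Youth and Parent
    "DS2001": (1_014, 2_614, 3_628),    # Wave 5.5 Adult
    "DS2002": (7_076, 53, 7_129),       # Wave 5.5 Youth and Parent
    "DS4001": (5_952, 2_009, 7_961),    # Wave 7.5 Adult
    "DS4002": (7_064, 1_885, 8_949),    # Wave 7.5 Youth and Parent
}


def mismatches(counts):
    """Return the names of files whose two components do not sum to the total."""
    return [
        name
        for name, (continuing, aged_up, total) in counts.items()
        if continuing + aged_up != total
    ]


# Wave 1 Cohort: interviewed adults and youth plus "shadow youth".
wave1_cohort = 45_971 + 7_207

print(mismatches(files))
print(wave1_cohort)
```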

Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.

There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.
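
The distinction between "all-waves" and "single-wave" weights can be sketched as a small eligibility rule. This Python example is a simplified reading of the descriptions above, not the study's actual weighting procedure: the function name, the wave labels as strings, and the idea of a single "satisfied" set (waves where a participant either completed an interview or verified their information) are all assumptions made for illustration. It also ignores which cohorts actually have a file of each type.

```python
def weight_types(required_waves, satisfied_waves, target_wave):
    """Return the weight types a participant would plausibly qualify for.

    required_waves: earlier waves the "all-waves" weight depends on.
    satisfied_waves: waves where the participant completed an interview
        or verified their information (hypothetical input).
    target_wave: the wave whose weight files are being considered.
    """
    if target_wave not in satisfied_waves:
        return []  # no interview at the target wave, so no weight
    types = ["single-wave"]
    if all(wave in satisfied_waves for wave in required_waves):
        types.append("all-waves")
    return types


# Waves listed above for the Wave 4 Cohort "all-waves" weight at Wave 7.5.
wave4_required = ["4", "4.5", "5", "5.5", "6", "7"]

full_history = {"4", "4.5", "5", "5.5", "6", "7", "7.5"}
partial_history = {"4", "7", "7.5"}

print(weight_types(wave4_required, full_history, "7.5"))
print(weight_types(wave4_required, partial_history, "7.5"))
```

The User Guide remains the authority on which weight to use for a given analysis.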

Dataset 4401 (DS4401) contains the Wave 7.5 State Identifier data for Adults and has 5 variables and 7,961 cases. Dataset 4402 (DS4402) contains the Wave 7.5 State Identifier data for Youth and Parents and has 5 variables and 8,949 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.5.

Dataset 4503 (DS4503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, PATH-ATS, Wave 6, Wave 7, and Wave 7.5 indicating if participants had ever/never used various tobacco products as of the Wave 7.5 data collection period. This data file contains 25 variables for all 82,139 study participants as of the Wave 7.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 7.5. This data file contains 53 variables and 157 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7.5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.5.

Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2021 (ICPSR 38502)

Released/updated on: 2022-10-31
Geographic coverage: United States
Time period: 2021-01-01--2021-12-31

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and drug use. There are more than 450 variables across the questionnaires. Drugs covered by this survey include amphetamines (stimulants), barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, tobacco, smokeless tobacco, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2021:

  • Data collection resumed in 2021, with a change to all web-based surveys.
  • Students completed the surveys on their personal or school-provided device.
  • Non-survey variables have been changed or added to facilitate analyses. For details, please see the codebook section "MTF Variable Information - Non-survey variables included in the data files - Survey mode and design variables for 2021".
  • Information about "screen break" issues, where series of questions were presented differently in the web-based survey than in the 2019/2020 tablet surveys, is available in the codebook and Appendix D.
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2020 (ICPSR 38156)

Released/updated on: 2021-10-26
Geographic coverage: United States

This survey of 12th-grade students is part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and drug use. There are about 1,400 variables across the questionnaires. Drugs covered by this survey include tobacco, smokeless tobacco, alcohol, marijuana, hashish, prescription medications, over-the-counter medications, LSD, hallucinogens, amphetamines (stimulants), Ritalin (methylphenidate), barbiturates (tranquilizers), cocaine, crack cocaine, GHB (gamma hydroxybutyrate), ecstasy, methamphetamine, and heroin. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2020:

  • All students recorded their survey answers on tablets that the project brought to the schools, preloaded with the MTF surveys.
  • Data collection was halted prematurely on March 15, 2020, when the University of Michigan stopped all projects that involved face-to-face data collection because of COVID-19 concerns. This resulted in a 2020 sample size about 25% of the size of a regular data collection.
  • Guidance for combining grades for analysis: see Appendix C of the codebook.
  • Information about potential mode effects for questions on student attitudes and beliefs, when comparing previous years' paper-based survey responses to the current tablet method of collection, is available in the codebook Introduction - Survey Mode section.
Curated
Restricted

Detroit [Michigan] Neighborhood Health Study, 2008-2013 (ICPSR 37038)

Released/updated on: 2021-10-07
Geographic coverage: Detroit, United States, Michigan
Time period: 2008-01-01--2009-01-01, 2009-01-01--2010-01-01, 2010-01-01--2011-01-01, 2011-01-01--2012-01-01

The Detroit Neighborhood Health Study (DNHS) is a prospective, representative longitudinal cohort study of predominantly African American adults living in Detroit, Michigan. The main purpose of the study was to determine the predictive effects of ecological stressors, such as income distribution and residential segregation, on the development of post-traumatic stress disorder (PTSD), substance use, and other psychological and behavioral outcomes. An additional purpose was to study the interrelationships between ecological stressors, exposure to potentially traumatic events (PTEs), PTSD, substance use, and immune function. The study team hypothesized that exposure to ecological stressors would influence the risk of PTE exposure, PTSD, substance use, other psychological outcomes, and the relationships between these factors.

The current collection includes data from all 5 waves of the study. Cohort participants were initially recruited in 2008 with a dual-frame probability design, using telephone numbers obtained from the U.S. Postal Service Delivery Sequence Files as well as a list-assisted random-digit-dial frame. Individuals without listed landlines or telephones and individuals with only a cell phone listed were invited to participate through a postal mail effort. Participants completed a 40-minute structured telephone interview annually from 2008 to 2012 to assess perceptions of participants' neighborhoods, mental and physical health status, social support, exposure to traumatic events, and alcohol and tobacco use. In addition, the study team completed a structured assessment of Detroit's 54 neighborhoods in order to describe the characteristics of respondents' neighborhoods. The assessment included information about the quality of housing exteriors; presence of graffiti, abandoned cars, alcohol and tobacco advertisements, and security warning signs; presence of vacant buildings; and street and traffic noise levels.

All survey participants were offered the opportunity to provide a blood specimen (venipuncture, blood spot, or saliva) for immune and inflammatory marker testing as well as genetic testing of DNA. Participants received an additional $25 USD if they elected to give a sample. Informed consent was obtained at the beginning of each interview and again at specimen collection. However, these specimens are not included as part of this data collection.

For more information about the study, please visit the Detroit Neighborhood Health Study website.

Genotypic data from DNHS are available on the NIH database of Genotypes and Phenotypes (dbGaP).

Curated

Treatment Episode Data Set -- Admissions (TEDS-A) -- Concatenated, 1992 to 2012 (ICPSR 25221)

Released/updated on: 2015-11-23
Geographic coverage: United States
Time period: 1992-01-01--2012-01-01

The Treatment Episode Data Set -- Admissions (TEDS-A) is a national census data system of annual admissions to substance abuse treatment facilities. TEDS-A provides annual data on the number and characteristics of persons admitted to public and private substance abuse treatment programs that receive public funding. The unit of analysis is a treatment admission. TEDS consists of data reported by the treatment programs to state substance abuse agencies, which in turn report the data to SAMHSA.

A sister data system, called the Treatment Episode Data Set -- Discharges (TEDS-D), collects data on discharges from substance abuse treatment facilities. The first year of TEDS-A data is 1992, while the first year of TEDS-D is 2006.

TEDS variables that are required to be reported are called the "Minimum Data Set (MDS)", while those that are optional are called the "Supplemental Data Set (SuDS)".

Variables in the MDS include: information on service setting, number of prior treatments, primary source of referral, gender, race, ethnicity, education, employment status, substance(s) abused, route of administration, frequency of use, age at first use, and whether methadone was prescribed in treatment. Supplemental variables include: diagnosis codes, presence of psychiatric problems, living arrangements, source of income, health insurance, expected source of payment, pregnancy and veteran status, marital status, detailed not-in-labor-force codes, detailed criminal justice referral codes, days waiting to enter treatment, and the number of arrests in the 30 days prior to admission (starting in 2008).

Substances abused include alcohol, cocaine and crack, marijuana and hashish, heroin, nonprescription methadone, other opiates and synthetics, PCP, other hallucinogens, methamphetamine, other amphetamines, other stimulants, benzodiazepines, other non-benzodiazepine tranquilizers, barbiturates, other non-barbiturate sedatives or hypnotics, inhalants, over-the-counter medications, and other substances.

Created variables include total number of substances reported, intravenous drug use (IDU), and flags for any mention of specific substances.
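
To make the idea of created variables concrete, the Python sketch below derives a substance count, an intravenous drug use indicator, and per-substance flags from one admission record. It is an illustration only: the record layout, the field names, the "injection" route label, and the set of flagged substances are all hypothetical and do not reproduce the TEDS-A coding scheme.

```python
# Hypothetical subset of substances to flag; not the TEDS-A list.
FLAGGED = ("alcohol", "cocaine", "heroin")


def created_variables(admission):
    """Derive summary variables from one admission record (hypothetical layout)."""
    substances = [s for s in admission["substances"] if s]
    routes = [r for r in admission.get("routes", []) if r]
    return {
        # Count each distinct substance once.
        "num_substances": len(set(substances)),
        # Any reported substance taken by injection marks the admission as IDU.
        "idu": any(route == "injection" for route in routes),
        # One flag per substance of interest: was it mentioned at all?
        "flags": {name: name in substances for name in FLAGGED},
    }


example = {
    "substances": ["alcohol", "heroin", None],
    "routes": ["oral", "injection", None],
}
derived = created_variables(example)
```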

Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2019 (ICPSR 37841)

Released/updated on: 2020-10-29
Geographic coverage: United States

These surveys of 12th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students are randomly assigned to complete one of six questionnaires, each with a different subset of topical questions, but all containing a set of "core" questions on demographics and drug use. There are about 1,400 variables across the questionnaires. Drugs covered by this survey include tobacco, smokeless tobacco, alcohol, marijuana, hashish, prescription medications, over-the-counter medications, LSD, hallucinogens, amphetamines (stimulants), Ritalin (methylphenidate), sedatives/barbiturates, tranquilizers, cocaine, crack cocaine, GHB (gamma hydroxybutyrate), ecstasy, methamphetamine, and heroin. Other topics include attitudes toward religion, changing roles for women, educational aspirations, self-esteem, exposure to drug education, and violence and crime (both in and out of school).

Highlights for 2019:

  • Change in methodology: half of the MTF schools completed in-class surveys on tablets loaded with the survey; the other half completed traditional paper-and-pencil surveys. Also see the Methodology section on this page for an overview and the codebook for details.
  • Expansion and revision of the study documentation in the codebook
  • New documentation available for download detailing the question adds/drops/changes to the surveys
  • Availability of supplemental data sets for previously unreleased questions

Two supplemental data files (DS8 and DS9) have been included this year by the Principal Investigators. These files each include three administrative variables for year (V1), form (V3), and ID (RESPONDENT_ID) along with a few additional variables of survey questions not previously released for Form 5 (DS8) and Form 6 (DS9) from 2016 to 2018. These same variables are already present in the main 2019 data files for Form 5 (DS6) and Form 6 (DS7). The front section of the codebook provides details about each of the variables. There are also instructions on how to merge the supplemental data onto the main data files for the previous three years:

  • 2018 data (ICPSR 37416)
  • 2017 data (ICPSR 37182)
  • 2016 data (ICPSR 36798)
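
A merge of this kind might look like the following Python sketch. It assumes, without confirmation from the codebook, that the three administrative variables named above (V1, V3, RESPONDENT_ID) together identify a respondent in both the supplemental and main files; the survey item names and all row values are hypothetical. The codebook instructions take precedence over this sketch.

```python
# Assumed composite key; the codebook defines the actual merge procedure.
KEY = ("V1", "V3", "RESPONDENT_ID")

# Hypothetical rows standing in for a prior-year main file and a supplement.
main_2018 = [
    {"V1": 2018, "V3": 5, "RESPONDENT_ID": 10001, "core_item": 1},
    {"V1": 2018, "V3": 5, "RESPONDENT_ID": 10002, "core_item": 2},
]
supplement = [
    {"V1": 2018, "V3": 5, "RESPONDENT_ID": 10001, "new_item": 3},
    {"V1": 2017, "V3": 5, "RESPONDENT_ID": 10001, "new_item": 4},
]


def merge_supplement(main_rows, supp_rows, key=KEY):
    """Add supplemental variables to main rows that share the composite key."""
    index = {tuple(row[k] for k in key): row for row in supp_rows}
    merged = []
    for row in main_rows:
        extra = index.get(tuple(row[k] for k in key), {})
        # Copy only the non-key supplemental variables onto the main row.
        merged.append({**row, **{k: v for k, v in extra.items() if k not in key}})
    return merged


merged_2018 = merge_supplement(main_2018, supplement)
```

Including the year in the key keeps a 2017 supplemental record from attaching to a 2018 respondent who happens to share the same ID.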
Curated
Simple Crosstabs

Monitoring the Future: A Continuing Study of American Youth (8th- and 10th-Grade Surveys), 2019 (ICPSR 37842)

Released/updated on: 2020-10-29
Geographic coverage: United States

These surveys of 8th- and 10th-grade students are part of a series that explores changes in important values, behaviors, and lifestyle orientations of contemporary American youth. Students in each grade are randomly assigned to complete one of four questionnaires, each with a different subset of topical questions but containing a set of "core" questions on demographics and drug use. There are more than 500 variables across the questionnaires. Drugs covered by this survey include amphetamines (stimulants), barbiturates (tranquilizers), other prescription drugs, over-the-counter medications, tobacco, smokeless tobacco, vaping, alcohol, inhalants, steroids, marijuana, hashish, LSD, hallucinogens, cocaine, crack, ecstasy, methamphetamine, and injectable drugs such as heroin.

Highlights for 2019:

  • Change in methodology: half of the MTF schools completed in-class surveys on tablets loaded with the survey; the other half completed traditional paper-and-pencil surveys. Also see the Methodology section on this page for an overview and the codebook for details.
  • Expansion and revision of the study documentation in the codebook
  • New documentation available for download detailing the question adds/drops/changes to the surveys
  • Availability of supplemental data sets for previously unreleased questions

A supplemental data file (DS2) has been included this year by the Principal Investigators. This file includes 17 variables (6 administrative and 11 survey questions) and 93,034 cases from the years 2016 to 2018. These same 11 variables are already in the main data file (DS1) for 2019. The front section of the codebook provides details about each of the variables. There are also instructions on how to merge the supplemental data onto the main data files for the previous three years:

  • 2018 data (ICPSR 37415)
  • 2017 data (ICPSR 37183)
  • 2016 data (ICPSR 36799)
Curated
Restricted

Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files (ICPSR 36231)

Released/updated on: 2026-04-21
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide for further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 0002 (DS0002) contains the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study.

Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview.

Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases.

Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment.

Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 1.

Dataset 1801 (DS1801) contains Location Characteristics for Wave 1 Adults. This data file contains 4 variables and 32,320 cases.

Dataset 1802 (DS1802) contains Location Characteristics for Wave 1 Youth. This data file contains 4 variables and 13,651 cases.

Dataset 1901 (DS1901) contains Study Research Derived Variables for Wave 1 Adults created by PATH Study analysts. This data file contains 104 variables and 32,320 cases.

Dataset 1902 (DS1902) contains Study Research Derived Variables for Wave 1 Youth created by PATH Study analysts. This data file contains 89 variables and 13,651 cases.

Dataset 2011 (DS2011) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,421 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.

Dataset 2012 (DS2012) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,596 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."

Dataset 2411 (DS2411) contains the Wave 2 State Identifier data for Adults and has 5 variables and 28,362 cases. Dataset 2412 (DS2412) contains the Wave 2 State Identifier data for Youth and Parents and has 5 variables and 12,172 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 2.

Dataset 2611 (DS2611) contains the Tobacco Universal Product Code (UPC) data from Wave 2. This data file contains 32 variables and 7,295 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 2. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 2.

Dataset 2801 (DS2801) contains Location Characteristics for Wave 2 Adults. This data file contains 4 variables and 28,362 cases.

Dataset 2802 (DS2802) contains Location Characteristics for Wave 2 Youth. This data file contains 4 variables and 12,172 cases.

Dataset 2901 (DS2901) contains Study Research Derived Variables for Wave 2 Adults created by PATH Study analysts. This data file contains 178 variables and 28,362 cases.

Dataset 2902 (DS2902) contains Study Research Derived Variables for Wave 2 Youth created by PATH Study analysts. This data file contains 123 variables and 12,172 cases.

Dataset 3011 (DS3011) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,359 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 3012 (DS3012) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,492 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 3111, 3211, 3112, and 3212 (DS3111, DS3211, DS3112, and DS3212) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, starting with Wave 3, the weight variables have been separated into individual data files. The "all-waves" weight files contain weights for respondents who completed an interview for all waves in which they were old enough to do so or verified their information with the study for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.
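The difference between the two Wave 3 weight types can be sketched as a rule over each respondent's per-wave status. This is a simplified reading of the description above, not the study's weighting code: the status codes are hypothetical, and the sketch assumes that a respondent needs a Wave 3 interview to receive either weight.

```python
# Hypothetical per-wave status codes (not PATH Study variable names or values):
#   "interview" = completed an interview in that wave
#   "verified"  = too young to be interviewed, but information was verified
#   "none"      = neither
def wave3_weight_types(status_by_wave):
    """Return the Wave 3 weight types a respondent would qualify for."""
    types = []
    if status_by_wave.get(3) == "interview":
        # Single-wave weights cover all Wave 3 respondents.
        types.append("single-wave")
        # All-waves weights also require participation in every earlier wave.
        if all(status_by_wave.get(w) in ("interview", "verified") for w in (1, 2, 3)):
            types.append("all-waves")
    return types


examples = {
    "continuing adult": {1: "interview", 2: "interview", 3: "interview"},
    "skipped Wave 2": {1: "interview", 2: "none", 3: "interview"},
    "aged-up youth": {1: "verified", 2: "verified", 3: "interview"},
}
results = {label: wave3_weight_types(s) for label, s in examples.items()}
```

Under these assumptions, a respondent who skipped Wave 2 would qualify only for the "single-wave" weight, while the other two examples would qualify for both.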

Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 3411 (DS3411) contains the Wave 3 State Identifier data for Adults and has 5 variables and 28,148 cases. Dataset 3412 (DS3412) contains the Wave 3 State Identifier data for Youth and Parents and has 5 variables and 11,814 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 3.

Dataset 3611 (DS3611) contains the Tobacco Universal Product Code (UPC) data from Wave 3. This data file contains 32 variables and 6,768 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 3. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 3.

Dataset 3801 (DS3801) contains Location Characteristics for Wave 3 Adults. This data file contains 4 variables and 28,148 cases.

Dataset 3802 (DS3802) contains Location Characteristics for Wave 3 Youth. This data file contains 4 variables and 11,814 cases.

Dataset 3901 (DS3901) contains Study Research Derived Variables for Wave 3 Adults created by PATH Study analysts. This data file contains 107 variables and 28,148 cases.

Dataset 3902 (DS3902) contains Study Research Derived Variables for Wave 3 Youth created by PATH Study analysts. This data file contains 88 variables and 11,814 cases.

Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,504 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,600 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
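The Wave 4 case counts above can be reconciled with a short arithmetic check. The sketch below only restates figures from the two descriptions; the dictionary layout and the function name are illustrative choices.

```python
# Case counts copied from the DS4001 and DS4002 descriptions.
wave4_files = {
    "DS4001 (Adult)": {
        "total": 33822,
        "parts": {"continuing": 25857, "aged-up": 1900, "replenishment": 6065},
    },
    "DS4002 (Youth and Parent)": {
        "total": 14798,
        "parts": {"continuing": 9365, "aged-up": 1694, "replenishment": 3739},
    },
}


def unexplained_cases(entry):
    """Total cases minus the sum of the listed subgroups (0 means they agree)."""
    return entry["total"] - sum(entry["parts"].values())


differences = {name: unexplained_cases(entry) for name, entry in wave4_files.items()}
```

A difference of zero for a file means the continuing, aged-up, and replenishment subgroups account for every case in it.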

Datasets 4111, 4211, 4321, 4112, 4212, and 4322 (DS4111, DS4211, DS4321, DS4112, DS4212, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.

Dataset 4401 (DS4401) contains the Wave 4 State Identifier data for Adults and has 5 variables and 33,822 cases. Dataset 4402 (DS4402) contains the Wave 4 State Identifier data for Youth and Parents and has 5 variables and 14,798 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 4. For adults and youth from the replenishment sample, the values also represent state of residence at the time of recruitment.

Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires, indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 4. This data file contains 32 variables and 7,684 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 4. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 4.

Dataset 4801 (DS4801) contains Location Characteristics for Wave 4 Adults. This data file contains 4 variables and 33,822 cases.

Dataset 4802 (DS4802) contains Location Characteristics for Wave 4 Youth. This data file contains 4 variables and 14,798 cases.

Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,606 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,776 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.

Dataset 5401 (DS5401) contains the Wave 5 State Identifier data for Adults and has 5 variables and 34,309 cases. Dataset 5402 (DS5402) contains the Wave 5 State Identifier data for Youth and Parents and has 5 variables and 12,098 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.

Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 5601 (DS5601) contains the Tobacco Universal Product Code (UPC) data from Wave 5. This data file contains 33 variables and 6,678 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 5.

Dataset 5801 (DS5801) contains Location Characteristics for Wave 5 Adults. This data file contains 4 variables and 34,309 cases.

Dataset 5802 (DS5802) contains Location Characteristics for Wave 5 Youth. This data file contains 4 variables and 12,098 cases.

Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,935 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 2,080 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth Interview and 60 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

Dataset 6401 (DS6401) contains the Wave 6 State Identifier data for Adults and has 5 variables and 30,516 cases. Dataset 6402 (DS6402) contains the Wave 6 State Identifier data for Youth and Parents and has 5 variables and 5,652 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 6.

Dataset 6503 (DS6503) contains data derived from responses to questionnaires in Waves 1-6 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 6601 (DS6601) contains the Tobacco Universal Product Code (UPC) data from Wave 6. This data file contains 53 variables and 5,408 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 6. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 6.

Dataset 6801 (DS6801) contains Location Characteristics for Wave 6 Adults. This data file contains 4 variables and 30,516 cases.

Dataset 6802 (DS6802) contains Location Characteristics for Wave 6 Youth. This data file contains 4 variables and 5,652 cases.

Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 3,221 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 2,171 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).

Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.

Dataset 7401 (DS7401) contains the Wave 7 State Identifier data for Adults and has 5 variables and 30,801 cases. Dataset 7402 (DS7402) contains the Wave 7 State Identifier data for Youth and Parents and has 5 variables and 10,834 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.

Dataset 7503 (DS7503) contains data derived from responses to questionnaires in Waves 1-7 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 7 data collection period. This data file contains 26 variables for all 82,139 study participants as of the Wave 7 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 7601 (DS7601) contains the Tobacco Universal Product Code (UPC) data from Wave 7. This data file contains 53 variables and 4,533 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.

Dataset 7801 (DS7801) contains Location Characteristics for Wave 7 Adults. This data file contains 4 variables and 30,801 cases.

Dataset 7802 (DS7802) contains Location Characteristics for Wave 7 Youth. This data file contains 4 variables and 10,834 cases.

Dataset 8001 (DS8001) contains the data from the Wave 8 Adult Questionnaire. This data file contains 3,467 variables and 31,477 cases. Of these cases, 30,021 are continuing adults having completed a prior Adult Questionnaire and 1,456 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 8002 (DS8002) contains the data from the Wave 8 Youth and Parent Questionnaire. This data file contains 2,393 variables and 8,002 cases. Of these cases, 7,046 are continuing youth having completed a prior Youth Interview and 956 are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 8111, 8121, 8122, 8211, 8221, 8231, 8232, 8711, 8721, 8722, 8731, and 8732 (DS8111, DS8121, DS8122, DS8211, DS8221, DS8231, DS8232, DS8711, DS8721, DS8722, DS8731, and DS8732) are data files comprising the weight variables for Wave 8. In Wave 8, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and 7. Note that only adults have "all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and 7.

There are three separate sets of files with "single-wave" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 8, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 8, regardless of their participation in the intervening waves. Note that only adults have "single-wave" weights for the Wave 1 and Wave 4 Cohorts; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8, and youth from the Wave 4 Cohort were selected as shadow youth, so they do not have any interview data from Wave 4. The "single-wave" weight files for the Wave 7 Cohort contain weights for participants who completed an interview in Wave 7 and in Wave 8.

There are also three separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, 7 and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. Note that only adults have "special collection all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, 7, and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. The "special collection all-waves" weight files for the Wave 7 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 7 and the special collection in Wave 7.5.

Dataset 8401 (DS8401) contains the Wave 8 State Identifier data for Adults and has 5 variables and 31,477 cases. Dataset 8402 (DS8402) contains the Wave 8 State Identifier data for Youth and Parents and has 5 variables and 8,002 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 8.

Dataset 8801 (DS8801) contains Location Characteristics for Wave 8 Adults. This data file contains 4 variables and 31,477 cases.

Dataset 8802 (DS8802) contains Location Characteristics for Wave 8 Youth. This data file contains 4 variables and 8,002 cases.

Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. In all waves of data collection, less than 0.5 percent of the parents did not complete an interview. Most questions are asked about the child.

When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) for the other child(ren) and the responses duplicated in that child(ren)'s data file(s).

Curated
Simple Crosstabs

Population Assessment of Tobacco and Health (PATH) Study [United States] Public-Use Files (ICPSR 36498)

Released/updated on: 2025-04-08
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The Population Assessment of Tobacco and Health (PATH) Study originally began by surveying 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
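The cohort arithmetic and the "aged-up" terminology above can be sketched as follows. The age thresholds (12 and 18) and the counts come from the paragraph; the function names are hypothetical, and the sketch ignores parental consent and the lower age bound for shadow youth.

```python
# Counts taken from the paragraph above.
WAVE1_RESPONDENTS = 45971   # adults and youth interviewed at Wave 1
WAVE1_SHADOW_YOUTH = 7207   # youth ages 9 to 11 sampled at Wave 1
wave1_cohort_size = WAVE1_RESPONDENTS + WAVE1_SHADOW_YOUTH


def respondent_role(age):
    """Map an age to the role implied by the description (simplified)."""
    if age >= 18:
        return "adult"
    if age >= 12:
        return "youth"
    return "shadow youth"


def aged_up_label(age_previous_wave, age_current_wave):
    """Return the 'aged-up' label for a role change between waves, else None."""
    before = respondent_role(age_previous_wave)
    after = respondent_role(age_current_wave)
    if before == "youth" and after == "adult":
        return "aged-up adult"
    if before == "shadow youth" and after == "youth":
        return "aged-up youth"
    return None
```

For example, `aged_up_label(17, 18)` returns `"aged-up adult"` and `aged_up_label(11, 12)` returns `"aged-up youth"`, while a respondent who stays in the same role returns `None`.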

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 14 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Public-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview.

Dataset 1002 (DS1002) contains the data from the Wave 1 Youth and Parent Questionnaire. This data file contains 1,228 variables and 13,651 cases.

Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."

Dataset 3001 (DS3001) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,139 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 3002 (DS3002) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,309 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 3101, 3102, 3201, and 3202 (DS3101, DS3102, DS3201, and DS3202) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, in Wave 3, the weight variables have been separated into individual data files for Adult and Youth Questionnaires. The "all-waves" weight files contain weights for those respondents who have completed an interview during all three waves of data collection. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.
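Because the Wave 3 weights are delivered in separate files, they have to be joined back onto the questionnaire data before weighted analysis. The following is a minimal sketch of such a join, not the study's documented procedure. The identifier name PERSONID, the fields answer, weight, and has_weight, and the toy rows are all assumptions made for illustration; the actual variable names should be taken from the codebooks.

```python
def attach_weights(cases, weights, key="PERSONID"):
    """Left-join weight fields onto questionnaire cases by a shared ID.

    Cases with no matching weight record are kept and flagged, so the
    analyst can decide how to handle them.
    """
    by_id = {row[key]: row for row in weights}
    merged = []
    for case in cases:
        w = by_id.get(case[key])
        out = dict(case)
        if w is not None:
            # Copy every weight field except the join key itself.
            out.update({k: v for k, v in w.items() if k != key})
        out["has_weight"] = w is not None
        merged.append(out)
    return merged


# Toy rows invented purely for illustration (not PATH Study data).
adult_cases = [
    {"PERSONID": "P001", "answer": 1},
    {"PERSONID": "P002", "answer": 2},
]
all_waves_weights = [{"PERSONID": "P001", "weight": 1520.5}]

for row in attach_weights(adult_cases, all_waves_weights):
    print(row)
```

An "all-waves" file covers only respondents who completed all three waves, so some Wave 3 cases are expected to come back unmatched. That is why the sketch keeps and flags them rather than dropping them.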

Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,182 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,389 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).

Datasets 4111, 4112, 4211, 4212, 4321, and 4322 (DS4111, DS4112, DS4211, DS4212, DS4321, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.

Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,315 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,530 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.

Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,589 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 1,822 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth interview and 30 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 2,813 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 1,897 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
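The subgroup counts quoted for the Wave 2 through Wave 7 questionnaire files can be checked against the stated number of cases. The sketch below uses only the figures given in the dataset descriptions above; the function name mismatches is an illustrative choice.

```python
# (total cases, subgroup counts) as quoted in the dataset descriptions above
REPORTED = {
    "DS2001": (28362, [26447, 1915]),
    "DS2002": (12172, [10081, 2091]),
    "DS3001": (28148, [26241, 1907]),
    "DS3002": (11814, [9769, 2045]),
    "DS4001": (33822, [25857, 1900, 6065]),
    "DS4002": (14798, [9365, 1694, 3739]),
    "DS5001": (34309, [29876, 4433]),
    "DS5002": (12098, [10446, 1652]),
    "DS6001": (30516, [28852, 1664]),
    "DS6002": (5652, [5622, 30]),
    "DS7001": (30801, [27258, 1740, 1803]),
    "DS7002": (10834, [3512, 1, 7321]),
}


def mismatches(reported):
    """Return the dataset IDs whose subgroup counts do not sum to the total."""
    return [ds for ds, (total, parts) in reported.items() if sum(parts) != total]


# An empty list means every reported breakdown is internally consistent.
print(mismatches(REPORTED))
```

The same check can be rerun against the row counts of the downloaded files to confirm that a download is complete.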

Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.
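The difference between the "all-waves" and "single-wave" inclusion rules can be expressed as two small predicates. This is a simplified sketch of the rules as worded above, not the study's weighting code. The status labels, the function names, and the example history are assumptions, and the sketch does not check whether a participant was actually old enough to be interviewed in a given wave.

```python
INTERVIEWED = "interviewed"  # completed an interview in that wave
VERIFIED = "verified"        # not old enough to be interviewed; information verified


def all_waves_eligible(status, cohort_start, target_wave):
    """All-waves rule: interviewed at target_wave, and interviewed or verified
    in every earlier wave from cohort_start onward."""
    if status.get(target_wave) != INTERVIEWED:
        return False
    return all(
        status.get(wave) in (INTERVIEWED, VERIFIED)
        for wave in range(cohort_start, target_wave)
    )


def single_wave_eligible(status, cohort_start, target_wave):
    """Single-wave rule: interviewed at cohort_start and at target_wave,
    regardless of participation in the intervening waves."""
    return (
        status.get(cohort_start) == INTERVIEWED
        and status.get(target_wave) == INTERVIEWED
    )


# Hypothetical Wave 1 Cohort member who missed Wave 3.
history = {1: INTERVIEWED, 2: INTERVIEWED, 4: INTERVIEWED,
           5: INTERVIEWED, 6: INTERVIEWED, 7: INTERVIEWED}

print(all_waves_eligible(history, cohort_start=1, target_wave=7))    # False
print(single_wave_eligible(history, cohort_start=1, target_wave=7))  # True
```

Setting cohort_start=4 applies the same logic to the Wave 4 Cohort, whose "all-waves" rule covers Waves 4, 5, and 6.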

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.

Dataset 6503 (DS6503) contains data derived from responses to Wave 1-6 (including Wave 4.5, Wave 5.5, and PATH-ATS) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. Across all waves of data collection, an average of 0.6 percent of the parents did not complete an interview. Most questions are asked about the child.

When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) about the other child(ren), and the responses were duplicated in the data file record(s) for those child(ren).
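One practical consequence is that a parent's answers about himself/herself appear once per selected youth in the Youth data, so they should not be counted as independent parent responses. The sketch below illustrates the resulting record structure. The field names youth_id, parent_self_q, and about_child_q and the toy values are hypothetical and are not taken from the codebooks.

```python
def build_youth_records(parent_self, about_youth):
    """Create one record per youth: the parent's answers about that youth plus
    a copy of the parent's once-only answers about himself/herself."""
    records = []
    for youth_id, responses in about_youth.items():
        record = {"youth_id": youth_id}
        record.update(responses)    # answers about this particular youth
        record.update(parent_self)  # same self-report copied into every record
        records.append(record)
    return records


# Toy values invented purely for illustration (not PATH Study data).
parent_self = {"parent_self_q": "A"}
about_youth = {"Y1": {"about_child_q": 1}, "Y2": {"about_child_q": 2}}

for rec in build_youth_records(parent_self, about_youth):
    print(rec)
```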