Comprehensive Investigation of the Role of Individuals, the Immediate Social Environment, and Neighborhoods in Trajectories of Adolescent Antisocial Behavior in Chicago, Illinois, 1994-2002 (ICPSR 33921)
Monitoring Drug Epidemics and the Markets That Sustain Them, Arrestee Drug Abuse Monitoring (ADAM) and ADAM II Data, 2000-2003 and 2007-2010 (ICPSR 33201)
Treatment Episode Data Set -- Admissions (TEDS-A) -- Concatenated, 1992 to 2012 (ICPSR 25221)
The Treatment Episode Data Set -- Admissions (TEDS-A) is a national census data system of annual admissions to substance abuse treatment facilities. TEDS-A provides annual data on the number and characteristics of persons admitted to public and private substance abuse treatment programs that receive public funding. The unit of analysis is a treatment admission. TEDS consists of data reported by the treatment programs to state substance abuse agencies, which in turn report the data to SAMHSA.
A sister data system, called the Treatment Episode Data Set -- Discharges (TEDS-D), collects data on discharges from substance abuse treatment facilities. The first year of TEDS-A data is 1992, while the first year of TEDS-D is 2006.
TEDS variables that are required to be reported are called the "Minimum Data Set (MDS)", while those that are optional are called the "Supplemental Data Set (SuDS)".
Variables in the MDS include: information on service setting, number of prior treatments, primary source of referral, gender, race, ethnicity, education, employment status, substance(s) abused, route of administration, frequency of use, age at first use, and whether methadone was prescribed in treatment. Supplemental variables include: diagnosis codes, presence of psychiatric problems, living arrangements, source of income, health insurance, expected source of payment, pregnancy and veteran status, marital status, detailed not in labor force codes, detailed criminal justice referral codes, days waiting to enter treatment, and the number of arrests in the 30 days prior to admission (starting in 2008).
Substances abused include alcohol, cocaine and crack, marijuana and hashish, heroin, nonprescription methadone, other opiates and synthetics, PCP, other hallucinogens, methamphetamine, other amphetamines, other stimulants, benzodiazepines, other non-benzodiazepine tranquilizers, barbiturates, other non-barbiturate sedatives or hypnotics, inhalants, over-the-counter medications, and other substances.
Created variables include total number of substances reported, intravenous drug use (IDU), and flags for any mention of specific substances.
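As a rough illustration of how created variables of this kind could be derived from the reported fields, the sketch below counts reported substances, sets an IDU indicator, and sets mention flags. The field names (`substances`, `routes`), the code values, and the function name are hypothetical and are not taken from the TEDS-A codebook, which defines its own variable names and codes.

```python
# Hypothetical route-of-administration code; the real codebook uses its own values.
INJECTION = "injection"


def derive_created_variables(admission):
    """Derive summary variables from one admission record.

    `admission` is a dict with two parallel lists (assumed layout):
      - "substances": substance names, with None for unused slots
      - "routes": route of administration for each substance slot
    """
    reported = [s for s in admission["substances"] if s is not None]
    return {
        # Total number of substances reported at admission
        "num_substances": len(reported),
        # Intravenous drug use: any substance taken by injection
        "idu": any(route == INJECTION for route in admission["routes"]),
        # Flags for any mention of a specific substance
        "heroin_flag": "heroin" in reported,
        "alcohol_flag": "alcohol" in reported,
    }


example = {
    "substances": ["heroin", "alcohol", None],
    "routes": ["injection", "oral", None],
}
flags = derive_created_variables(example)
```

The flags here are computed over every reported substance rather than only the first, which matches the idea of "any mention" described above.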
Treatment Episode Data Set -- Discharges (TEDS-D) -- Concatenated, 2006 to 2011 (ICPSR 30122)
The Treatment Episode Data Set -- Discharges (TEDS-D) is a national census data system of annual discharges from substance abuse treatment facilities. TEDS-D provides annual data on the number and characteristics of persons discharged from public and private substance abuse treatment programs that receive public funding. Data collected both at admission and at discharge are included. The unit of analysis is a treatment discharge. TEDS-D consists of data reported by the treatment programs to state substance abuse agencies, which in turn report the data to SAMHSA.
A sister data system, called the Treatment Episode Data Set -- Admissions (TEDS-A), collects data on admissions to substance abuse treatment facilities. The first year of TEDS-A data is 1992, while the first year of TEDS-D is 2006.
TEDS-D variables that are required to be reported are called the "Minimum Data Set (MDS)", while those that are optional are called the "Supplemental Data Set (SuDS)".
Variables unique to TEDS-D, and not part of TEDS-A, are the length of stay, reason for leaving treatment, and service setting at time of discharge. TEDS-D also provides many of the same variables that exist in TEDS-A. These include information on service setting, number of prior treatments, primary source of referral, gender, race, ethnicity, education, employment status, substance(s) abused, route of administration, frequency of use, age at first use, and whether methadone was prescribed in treatment. Supplemental variables include: diagnosis codes, presence of psychiatric problems, living arrangements, source of income, health insurance, expected source of payment, pregnancy and veteran status, marital status, detailed not in labor force codes, detailed criminal justice referral codes, days waiting to enter treatment, and the number of arrests in the 30 days prior to admission (starting in 2008).
Substances abused include alcohol, cocaine and crack, marijuana and hashish, heroin, nonprescription methadone, other opiates and synthetics, PCP, other hallucinogens, methamphetamine, other amphetamines, other stimulants, benzodiazepines, other non-benzodiazepine tranquilizers, barbiturates, other non-barbiturate sedatives or hypnotics, inhalants, over-the-counter medications, and other substances.
Created variables include total number of substances reported, intravenous drug use (IDU), and flags for any mention of specific substances.
Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2000 (ICPSR 3270)
Process and Outcome Evaluation of the Residential Substance Abuse Treatment (RSAT) Program in Kyle, Texas, 1993-1995 (ICPSR 2765)
Characteristics of Arrestees at Risk for Co-Existing Substance Abuse and Mental Disorder in Cleveland, Ohio, 2003 (ICPSR 20352)
National Drug Abuse Treatment System Survey, Waves II-IV (ICPSR 4146)
Modeling Impacts of Policing Initiatives on Drug and Criminal Careers of Arrestees in New York City, New York, 1999 (ICPSR 3604)
Process Evaluation of a Residential Substance Abuse Treatment (RSAT) Program in Dallas County, Texas, 1998-1999 (ICPSR 3077)
Multi-Site Adult Drug Court Evaluation (MADCE), 2003-2009 (ICPSR 30983)
The Multi-Site Adult Drug Court Evaluation (MADCE) study included 23 drug courts and 6 comparison sites selected from 8 states across the country. The purpose of the study was to: (1) test whether drug courts reduce drug use, crime, and multiple other problems associated with drug abuse, in comparison with similar offenders not exposed to drug courts, (2) address how drug courts work and for whom by isolating key individual and program factors that make drug courts more or less effective in achieving their desired outcomes, (3) explain how offender attitudes and behaviors change when they are exposed to drug courts and how these changes help explain the effectiveness of drug court programs, and (4) examine whether drug courts generate cost savings.
Offenders in all 29 sites were surveyed in 3 waves, at baseline, 6 months later, and 18 months after enrollment. The research comprises three major components: process evaluation, impact evaluation, and a cost-benefit analysis. The process evaluation describes how the 23 drug court sites vary in program eligibility, supervision, treatment, team collaboration, and other key policies and practices. The impact evaluation examines whether drug courts produce better outcomes than comparison sites and tests which court policies and offender attitudes might explain those effects. The cost-benefit analysis evaluates drug court costs and benefits.
Adoption of Innovations in Private Alcohol and Drug Treatment Centers in the United States [Restricted-Use], 2009-2013 (ICPSR 37621)
Adoption of Innovations in Private Alcohol and Drug Treatment Centers is a multi-wave longitudinal study conducted between 2009 and 2013. The study goal was to measure the adoption and implementation of evidence-based treatment practices in treatment centers that received more than 50 percent of their total operational funding from sources that were not guaranteed from year to year. This definition is based on the concept of entrepreneurship, namely the necessity for the treatment organization to respond to changing conditions in the external political and economic environment in order to obtain half or more of its funding. The innovations considered are of three types usually specific to organizations treating substance use disorders:
- medication-assisted treatments
- psychosocial treatments
- managerial practices
This data set consists of one of the multiple "waves" of data collection. The data were collected at four points in time. The baseline data, collected from June 2009 through October 2011 from 327 treatment centers, were obtained through face-to-face onsite interviews ranging from 1 to 4 hours in duration. These interviews were conducted with administrators of the respective treatment centers. In 70 of the 327 treatment centers, an administrator of the overall center and the administrator of clinical operations separately completed administrative and clinical interviews. In the remaining 257 centers, all of the administrative and clinical data were collected from the administrator of the overall center since there was no specialized administrator of clinical operations. The baseline data available here merge the data collected through these two different procedures so that the variables measured are identical for all centers regardless of the procedure.
The collected data include detailed information on Medication Assisted Treatment (MAT) and other treatment strategies used by the center to treat opioid use disorder (OUD) and alcohol use disorder (AUD). Where a center did not use medications, respondents were asked why the available medications were not used in treatment. Other sections of the interviews covered data on the organizations, their management, and other clinical practices implemented for OUD, AUD, and substance use disorder (SUD).
Three follow-up interviews were conducted via telephone at six-month intervals following the previous interview. These follow-up interviews were much shorter than the baseline interview. They centered on key changes in the center's operation and on the adoption of key innovations, while retaining a focus on medications provided for treatment.
Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2002 (ICPSR 3815)
The Community Vulnerability and Responses to Drug-User-Related HIV/AIDS, 1990-2013 [96 Metropolitan Statistical Areas, United States] (ICPSR 36575)
The Community Vulnerability and Responses to Drug-User-Related HIV/AIDS, 1990-2013 [96 Metropolitan Statistical Areas, United States] study (CVAR) was a research study of why large United States Metropolitan Statistical Areas (MSAs) vary over time in their vulnerability to HIV/AIDS among drug users and in MSA responses to HIV/AIDS. This collection contains estimates of HIV prevalence among people who injected drugs (PWID) and among sub-populations of PWID. The collection comprises ten datasets with differing numbers of variables and provides trend data that describe the following:
- Epidemiologic outcomes including population prevalence of PWIDs and Non-injecting drug users (NIDUs), and particularly their prevalence among youth; and, among PWIDs, HIV prevalence, late-diagnosis HIV cases, and AIDS incidence and mortality.
- Implementation of evidence-based drug-related interventions including drug abuse treatment, syringe exchange, HIV counseling and testing.
- Implementation of non-evidence-based drug-related interventions including incarceration and arrests of drug users.
The collection contains data on the MSA sub-populations including Black, Hispanic, White and "other" race categories. In addition, some statistics are presented in age range categories such as ages 15-29, 30-64 and 15-64.
Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2003 (ICPSR 4020)
Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2001 (ICPSR 3688)
State Investments in Successful Transitions to Adulthood, 1970-2000 (ICPSR 34373)
This research investigated the relationship between ascribed characteristics, family resources, personal circumstances, and public policies as these affect the transition to adulthood. The transition to adulthood has been extensively studied during the last four decades using a variety of well-established approaches and methods. Changes in the structure and pace of youth-to-adult transitions have been extensively documented, along with the increasingly complex lives young people lead as they negotiate the transition to adulthood. Relatively less attention has been devoted to the factors leading to these changes, and a variety of public policies related to state economic development efforts, education, and financial support for higher education have yet to be examined in any detail. This project built on the principal investigators' prior work on life course transitions and state economic and political contexts to estimate behavioral models of the late 20th and early 21st century transition to adulthood.
Specifically, this research:
- Defines and describes the successful transition to adulthood in terms of human capital accumulation, attainment of economic security, and partnership and life satisfaction.
- Identifies group and individual disparities in successful transitions, defined by ascribed characteristics, family resources, and personal circumstances.
- Measures the impact of the social and economic environments where these transitions occur and the effects of state structures and policies on the successful transition to adulthood, specifically examining whether the impact of these state policies differs by race/ethnicity, immigrant status, and disability status.
The analysis used discrete hazard modeling and hierarchical generalized linear modeling (HGLM) to build a general model of the transition to adulthood on a wide variety of dimensions (from educational attainment to stable employment in a full-time job, employment in a job with health insurance, to independent residence and life satisfaction) and examined systematic changes in the process leading to adulthood across cohorts and across race/ethnic, immigrant, and disability groups.
Northwestern Juvenile Project (Cook County, Illinois): Follow-up 1, 1998-2001 (ICPSR 34931)
This study contains data from the first follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. This initial follow-up occurred approximately three years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.
The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.
The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, in correctional settings, or by telephone if they lived farther than two hours from Chicago.
Dynamics of Retail Methamphetamine Markets in New York City, 2007-2009 (ICPSR 29821)
Northwestern Juvenile Project (Cook County, Illinois): Follow-up 2, 1999 - 2005 (ICPSR 36629)
This study contains data from the second follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. This second follow-up occurred approximately 3.5 years after the baseline interview and focused on the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risky behaviors.
The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Researchers studied changes in disorders over time (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. The NJP addressed the patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors are interrelated.
The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. A random subsample of 997 of the baseline participants was chosen for second follow-up interviews. Researchers tracked participants from the time they left detention and re-interviewed them regardless of where they were living when their follow-up interview was due: in the community, in correctional settings, or by telephone if they lived farther than two hours from Chicago.
The study was funded by OJJDP, several institutes at the National Institutes of Health, and other federal agencies and private foundations. The National Institutes of Health funded an additional component on HIV/AIDS risk behaviors.
Northwestern Juvenile Project (Cook County, Illinois), Follow-up 4, 2000-2006 (ICPSR 36686)
This study contains data from the fourth follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. The fourth follow-up occurred approximately 4.5 years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.
The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.
The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. All participants were eligible for fourth follow-up interviews. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, in correctional settings, or by telephone if they lived farther than two hours from Chicago.
Northwestern Juvenile Project (Cook County, Illinois), Follow-up 3, 1999-2007 (ICPSR 36651)
This study contains data from the third follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. The third follow-up occurred approximately four years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.
The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.
The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. A random subsample of 997 of the baseline participants was chosen for third follow-up interviews. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, in correctional settings, or by telephone if they lived farther than two hours from Chicago.
Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files (ICPSR 38008)
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). For Wave 1 (baseline), the study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco.
45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete the Youth Interview after parental consent.
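The aging-up rules in the paragraph above can be sketched as a small classification function. This is a simplified illustration only: the role labels and the function name are hypothetical, parental consent is not modeled, and the Restricted-Use Files User Guide remains the authority on how respondent types are actually assigned.

```python
# Wave 1 Cohort size as stated above: interviewed adults and youth plus shadow youth.
WAVE1_INTERVIEWED = 45_971
WAVE1_SHADOW_YOUTH = 7_207
WAVE1_COHORT = WAVE1_INTERVIEWED + WAVE1_SHADOW_YOUTH


def respondent_type(prior_role, age):
    """Classify a cohort member at a follow-up wave.

    prior_role: "adult", "youth", or "shadow youth" (role at the previous wave)
    age: age in years at the current wave
    """
    if prior_role == "adult":
        return "continuing adult"
    if age >= 18:
        # Youth who turn 18 are invited to complete the Adult Interview
        return "aged-up adult"
    if prior_role == "shadow youth":
        # Shadow youth become eligible for the Youth Interview at age 12
        return "aged-up youth" if age >= 12 else "shadow youth"
    return "continuing youth"
```

For example, `respondent_type("shadow youth", 12)` returns `"aged-up youth"`, while `respondent_type("youth", 18)` returns `"aged-up adult"`.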
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
Dataset 0001 (DS0001) contains the data from the Public-Use File Master Linkage File (PUF-MLF). This file contains 93 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files.
Dataset 0002 (DS0002) contains the data from the Restricted-Use File Master Linkage File (RUF-MLF). This file contains 217 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.
Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Restricted-Use Files (ICPSR 37519)
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.
45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
Wave 4.5 was a special data collection for youth only who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.
Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older were interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).
Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.
Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,617 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.
There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.
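The inclusion rules for the three Wave 4.5 weight files described above can be summarized in a short sketch. The function name, the argument layout, and the returned labels are hypothetical and serve only to restate the rules; the files themselves already encode who receives a weight, so this logic is not needed to use them.

```python
def wave45_weight_files(in_wave1_cohort, in_wave4_cohort, interviewed, verified):
    """Return the Wave 4.5 weight files in which a youth would appear.

    interviewed: set of wave labels (e.g. {"1", "2", "4.5"}) with a completed interview
    verified: set of wave labels where a participant too young to be
              interviewed verified their information with the study
    """
    files = []
    if "4.5" not in interviewed:
        # Every weight file requires a completed Wave 4.5 Youth Interview
        return files
    took_part = interviewed | verified
    if in_wave1_cohort and {"1", "2", "3", "4"} <= took_part:
        files.append("all-waves (Wave 1 Cohort)")
    if in_wave1_cohort and "1" in interviewed:
        # Participation in the intervening waves does not matter here
        files.append("single-wave (Wave 1 Cohort)")
    if in_wave4_cohort:
        files.append("single-wave (Wave 4 Cohort)")
    return files


# A Wave 1 Cohort youth who missed Wave 3 but is also in the Wave 4 Cohort
missed_wave3 = wave45_weight_files(
    True, True, interviewed={"1", "2", "4", "4.5"}, verified=set()
)
```

In this example the youth has no all-waves weight, because Wave 3 is missing, but still appears in both single-wave files.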
Dataset 1402 (DS1402) contains the Wave 4.5 State Identifier data for Youth and Parents and has 5 variables and 13,131 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of Wave 4.5.
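Linking the State Identifier data to the questionnaire data amounts to a one-to-one join on PERSONID. The sketch below uses plain Python dictionaries; apart from PERSONID, every variable name in it (`STATE_FIPS`, `STATE_ABBR`, `STATE_NAME`, `EVER_USED`) is a hypothetical placeholder rather than a name from the PATH Study codebooks, and the records are invented for illustration.

```python
def link_state_identifiers(questionnaire_rows, state_rows):
    """Attach state variables to questionnaire records by PERSONID.

    Both arguments are lists of dicts, one dict per case. Questionnaire
    cases without a matching state record are kept unchanged.
    """
    state_by_person = {row["PERSONID"]: row for row in state_rows}
    linked = []
    for row in questionnaire_rows:
        merged = dict(row)  # copy so the input records are not modified
        state = state_by_person.get(row["PERSONID"], {})
        for name, value in state.items():
            if name != "PERSONID":
                merged[name] = value
        linked.append(merged)
    return linked


# Invented example records
questionnaire = [
    {"PERSONID": "P0001", "EVER_USED": 1},
    {"PERSONID": "P0002", "EVER_USED": 0},
]
states = [
    {"PERSONID": "P0001", "STATE_FIPS": "17", "STATE_ABBR": "IL", "STATE_NAME": "Illinois"},
]
linked = link_state_identifiers(questionnaire, states)
```

A left join is used here so that no questionnaire case is dropped if a state record is missing; with files that share the same cases, every record would be matched.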
Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,619 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,871 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5 and 5.
The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.
Dataset 2401 (DS2401) contains the Wave 5.5 State Identifier data for Adults and has 5 variables and 3,628 cases. Dataset 2402 (DS2402) contains the Wave 5.5 State Identifier data for Youth and Parents and has 5 variables and 7,129 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.5.
Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, and Wave 5.5 indicating if participants had ever/never used various tobacco products as of the Wave 5.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 977 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.
Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.
The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.
Dataset 3401 (DS3401) contains the PATH-ATS State Identifier data and has 5 variables and 8,874 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of PATH-ATS.
Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 3,142 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 2,169 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.
There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.
Dataset 4401 (DS4401) contains the Wave 7.5 State Identifier data for Adults and has 5 variables and 7,961 cases. Dataset 4402 (DS4402) contains the Wave 7.5 State Identifier data for Youth and Parents and has 5 variables and 8,949 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.5.
Dataset 4503 (DS4503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, PATH-ATS, Wave 6, Wave 7, and Wave 7.5 indicating if participants had ever/never used various tobacco products as of the Wave 7.5 data collection period. This data file contains 25 variables for all 82,139 study participants as of the Wave 7.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 7.5. This data file contains 53 variables and 157 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7.5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.5.
Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Public-Use Files (ICPSR 37786)
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco.
45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
Please refer to the Public-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
Wave 4.5 was a special data collection limited to youth who were aged 12 to 17 at the time of the Wave 4.5 interview. For members of the Wave 1 Cohort, Wave 4.5 was the fourth annual follow-up wave. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.
Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older was interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).
Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.
Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,395 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.
There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.
Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,323 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,625 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, and 5.
The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.
Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 908 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.
Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.
The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.
Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, and PATH-ATS, indicating if participants had ever/never used various tobacco products as of the Wave 5.5/PATH-ATS data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5/PATH-ATS data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 2,760 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 1,889 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.
There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.
Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files (ICPSR 36231)
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.
45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
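The cohort figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in these descriptions. It assumes that each replenishment sample adds only new participants to the running total; that is an inference from the 67,276 and 82,139 participant counts cited for the ever/never and State Design datasets, not a statement taken from the User Guide.

```python
# Figures quoted in the study description.
WAVE1_RESPONDENTS = 45_971    # adults and youth interviewed at Wave 1
WAVE1_SHADOW_YOUTH = 7_207    # youth ages 9 to 11 sampled at Wave 1
WAVE4_REPLENISHMENT = 14_098  # probability sample selected at Wave 4
WAVE7_REPLENISHMENT = 14_863  # second replenishment sample selected at Wave 7

# Wave 1 Cohort = Wave 1 respondents plus shadow youth.
wave1_cohort = WAVE1_RESPONDENTS + WAVE1_SHADOW_YOUTH

# Assumption: each replenishment sample contributes only new participants.
participants_after_wave4 = wave1_cohort + WAVE4_REPLENISHMENT
participants_after_wave7 = participants_after_wave4 + WAVE7_REPLENISHMENT

print(wave1_cohort)              # 53178
print(participants_after_wave4)  # 67276
print(participants_after_wave7)  # 82139
```

Note that the Wave 4 Cohort (52,731) and Wave 7 Cohort (46,169) are smaller than these running totals because they include only participants who met the cohort criteria at the time of the respective wave.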
Please refer to the Restricted-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study.
Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview.
Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases.
Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment.
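As a minimal sketch of how PERSONID might be used to link a State Identifier dataset to questionnaire data, the example below joins two small lists of toy records. Only the name PERSONID comes from the description above; the other field names (`example_item`, `state_fips`, `state_abbr`, `state_name`) and all record values are hypothetical placeholders, not actual PATH Study variable names or data.

```python
# Toy records for illustration; only PERSONID is a name from the description.
questionnaire = [
    {"PERSONID": "P001", "example_item": 1},
    {"PERSONID": "P002", "example_item": 0},
]
state_identifier = [
    {"PERSONID": "P001", "state_fips": "26", "state_abbr": "MI", "state_name": "Michigan"},
    {"PERSONID": "P002", "state_fips": "17", "state_abbr": "IL", "state_name": "Illinois"},
]


def link_on_personid(left, right):
    """Left-join two lists of records on PERSONID."""
    lookup = {row["PERSONID"]: row for row in right}
    linked = []
    for row in left:
        # Records without a match keep only their own fields.
        match = lookup.get(row["PERSONID"], {})
        linked.append({**row, **match})
    return linked


linked = link_on_personid(questionnaire, state_identifier)
for row in linked:
    print(row["PERSONID"], row["state_abbr"], row["example_item"])
```

Because the Adult and Youth State Identifier files have the same case counts as the corresponding questionnaire files, a one-to-one match on PERSONID would be expected, though that should be verified against the actual files.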
Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 1.
Dataset 1801 (DS1801) contains Location Characteristics for Wave 1 Adults. This data file contains 4 variables and 32,320 cases.
Dataset 1802 (DS1802) contains Location Characteristics for Wave 1 Youth. This data file contains 4 variables and 13,651 cases.
Dataset 1901 (DS1901) contains Study Research Derived Variables for Wave 1 Adults created by PATH Study analysts. This data file contains 104 variables and 32,320 cases.
Dataset 1902 (DS1902) contains Study Research Derived Variables for Wave 1 Youth created by PATH Study analysts. This data file contains 89 variables and 13,651 cases.
Dataset 2011 (DS2011) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,421 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.
Dataset 2012 (DS2012) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,596 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."
Dataset 2411 (DS2411) contains the Wave 2 State Identifier data for Adults and has 5 variables and 28,362 cases. Dataset 2412 (DS2412) contains the Wave 2 State Identifier data for Youth and Parents and has 5 variables and 12,172 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 2.
Dataset 2611 (DS2611) contains the Tobacco Universal Product Code (UPC) data from Wave 2. This data file contains 32 variables and 7,295 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 2. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 2.
Dataset 2801 (DS2801) contains Location Characteristics for Wave 2 Adults. This data file contains 4 variables and 28,362 cases.
Dataset 2802 (DS2802) contains Location Characteristics for Wave 2 Youth. This data file contains 4 variables and 12,172 cases.
Dataset 2901 (DS2901) contains Study Research Derived Variables for Wave 2 Adults created by PATH Study analysts. This data file contains 178 variables and 28,362 cases.
Dataset 2902 (DS2902) contains Study Research Derived Variables for Wave 2 Youth created by PATH Study analysts. This data file contains 123 variables and 12,172 cases.
Dataset 3011 (DS3011) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,359 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 3012 (DS3012) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,492 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 3111, 3211, 3112, and 3212 (DS3111, DS3211, DS3112, and DS3212) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, starting with Wave 3, the weight variables have been separated into individual data files. The "all-waves" weight files contain weights for respondents who completed an interview for all waves in which they were old enough to do so or verified their information with the study for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.
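The distinction between the two Wave 3 weight types can be illustrated with a small sketch. The status labels ("interviewed", "verified", "missed") and the function name are hypothetical, and the sketch assumes that a Wave 3 interview is required for either weight; the User Guide remains the authority on the actual eligibility rules.

```python
def wave3_weight_files(history):
    """Return the Wave 3 weight types a participant would appear in.

    `history` maps wave number (1, 2, 3) to a hypothetical status label:
    "interviewed", "verified" (too young to interview), or "missed".
    """
    completed_wave3 = history.get(3) == "interviewed"
    # "All-waves" requires an interview or verification at every wave so far.
    no_gaps = all(
        history.get(wave) in ("interviewed", "verified") for wave in (1, 2, 3)
    )
    files = []
    if completed_wave3:
        files.append("single-wave")
    if completed_wave3 and no_gaps:
        files.append("all-waves")
    return files


# An aged-up youth who was verified at Wave 1 and interviewed afterwards.
print(wave3_weight_files({1: "verified", 2: "interviewed", 3: "interviewed"}))
# A respondent who skipped Wave 2 but returned for Wave 3.
print(wave3_weight_files({1: "interviewed", 2: "missed", 3: "interviewed"}))
```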
Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 3411 (DS3411) contains the Wave 3 State Identifier data for Adults and has 5 variables and 28,148 cases. Dataset 3412 (DS3412) contains the Wave 3 State Identifier data for Youth and Parents and has 5 variables and 11,814 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 3.
Dataset 3611 (DS3611) contains the Tobacco Universal Product Code (UPC) data from Wave 3. This data file contains 32 variables and 6,768 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 3. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 3.
Dataset 3801 (DS3801) contains Location Characteristics for Wave 3 Adults. This data file contains 4 variables and 28,148 cases.
Dataset 3802 (DS3802) contains Location Characteristics for Wave 3 Youth. This data file contains 4 variables and 11,814 cases.
Dataset 3901 (DS3901) contains Study Research Derived Variables for Wave 3 Adults created by PATH Study analysts. This data file contains 107 variables and 28,148 cases.
Dataset 3902 (DS3902) contains Study Research Derived Variables for Wave 3 Youth created by PATH Study analysts. This data file contains 88 variables and 11,814 cases.
Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,504 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).
Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,600 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
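The case breakdowns given for the two Wave 4 questionnaire files can be checked against their stated totals. The sketch below uses only the counts quoted above; the dictionary layout and function name are illustrative choices.

```python
# (stated total, [continuing, aged-up, replenishment sample]) per file.
wave4_files = {
    "DS4001 (Adult)": (33_822, [25_857, 1_900, 6_065]),
    "DS4002 (Youth and Parent)": (14_798, [9_365, 1_694, 3_739]),
}


def breakdown_matches(total, parts):
    """Return True when the component counts sum to the stated total."""
    return sum(parts) == total


results = {
    name: breakdown_matches(total, parts)
    for name, (total, parts) in wave4_files.items()
}
for name, ok in results.items():
    print(name, "consistent" if ok else "inconsistent")
```

The same check can be applied to the continuing and aged-up counts reported for the other waves.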
Datasets 4111, 4211, 4321, 4112, 4212, and 4322 (DS4111, DS4211, DS4321, DS4112, DS4212, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.
Dataset 4401 (DS4401) contains the Wave 4 State Identifier data for Adults and has 5 variables and 33,822 cases. Dataset 4402 (DS4402) contains the Wave 4 State Identifier data for Youth and Parents and has 5 variables and 14,798 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 4. For adults and youth from the replenishment sample, the values also represent state of residence at the time of recruitment.
Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires, indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 4. This data file contains 32 variables and 7,684 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 4. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 4.
Dataset 4801 (DS4801) contains Location Characteristics for Wave 4 Adults. This data file contains 4 variables and 33,822 cases.
Dataset 4802 (DS4802) contains Location Characteristics for Wave 4 Youth. This data file contains 4 variables and 14,798 cases.
Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,606 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,776 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.
Dataset 5401 (DS5401) contains the Wave 5 State Identifier data for Adults and has 5 variables and 34,309 cases. Dataset 5402 (DS5402) contains the Wave 5 State Identifier data for Youth and Parents and has 5 variables and 12,098 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.
Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 5601 (DS5601) contains the Tobacco Universal Product Code (UPC) data from Wave 5. This data file contains 33 variables and 6,678 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 5.
Dataset 5801 (DS5801) contains Location Characteristics for Wave 5 Adults. This data file contains 4 variables and 34,309 cases.
Dataset 5802 (DS5802) contains Location Characteristics for Wave 5 Youth. This data file contains 4 variables and 12,098 cases.
Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,935 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 2,080 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth Interview and 60 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.
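The Wave 6 "all-waves" and "single-wave" eligibility rules described above can be sketched as a small function. This is a simplified illustration under stated assumptions: the function name and arguments are hypothetical, "completed" is taken to mean either a completed interview or verified information, and the "special collection all-waves" weights are left out because they depend on the Wave 4.5, Wave 5.5, and PATH-ATS collections. Consult the user guide for the authoritative definitions.

```python
def weight_types(cohort_start, target_wave, completed):
    """Return the weight types a participant appears to qualify for.

    cohort_start: first wave of the participant's cohort (1 or 4).
    target_wave:  wave whose weights are wanted (for example, 6).
    completed:    set of waves in which the participant completed an
                  interview or had their information verified.
    """
    # Every weight type requires an interview in the target wave.
    if target_wave not in completed:
        return []

    types = []
    # Single-wave: cohort baseline plus target wave, intervening waves ignored.
    if cohort_start in completed:
        types.append("single-wave")
    # All-waves: every wave from the cohort baseline up to the target wave.
    required = set(range(cohort_start, target_wave))
    if required <= completed:
        types.append("all-waves")
    return types


print(weight_types(1, 6, {1, 2, 3, 4, 5, 6}))  # full participation
print(weight_types(1, 6, {1, 3, 6}))           # skipped waves 2, 4, and 5
print(weight_types(4, 6, {4, 5}))              # no Wave 6 interview
```

Under this reading, anyone eligible for "all-waves" weights is also eligible for "single-wave" weights, since the baseline wave is one of the required waves.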
Dataset 6401 (DS6401) contains the Wave 6 State Identifier data for Adults and has 5 variables and 30,516 cases. Dataset 6402 (DS6402) contains the Wave 6 State Identifier data for Youth and Parents and has 5 variables and 5,652 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 6.
Dataset 6503 (DS6503) contains data derived from responses to questionnaires in Waves 1-6 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 6601 (DS6601) contains the Tobacco Universal Product Code (UPC) data from Wave 6. This data file contains 53 variables and 5,408 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 6. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 6.
Dataset 6801 (DS6801) contains Location Characteristics for Wave 6 Adults. This data file contains 4 variables and 30,516 cases.
Dataset 6802 (DS6802) contains Location Characteristics for Wave 6 Youth. This data file contains 4 variables and 5,652 cases.
Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 3,221 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).
Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 2,171 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.
The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.
Dataset 7401 (DS7401) contains the Wave 7 State Identifier data for Adults and has 5 variables and 30,801 cases. Dataset 7402 (DS7402) contains the Wave 7 State Identifier data for Youth and Parents and has 5 variables and 10,834 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.
Dataset 7503 (DS7503) contains data derived from responses to questionnaires in Waves 1-7 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 7 data collection period. This data file contains 26 variables for all 82,139 study participants as of the Wave 7 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 7601 (DS7601) contains the Tobacco Universal Product Code (UPC) data from Wave 7. This data file contains 53 variables and 4,533 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.
Dataset 7801 (DS7801) contains Location Characteristics for Wave 7 Adults. This data file contains 4 variables and 30,801 cases.
Dataset 7802 (DS7802) contains Location Characteristics for Wave 7 Youth. This data file contains 4 variables and 10,834 cases.
Dataset 8001 (DS8001) contains the data from the Wave 8 Adult Questionnaire. This data file contains 3,467 variables and 31,477 cases. Of these cases, 30,021 are continuing adults having completed a prior Adult Questionnaire and 1,456 are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 8002 (DS8002) contains the data from the Wave 8 Youth and Parent Questionnaire. This data file contains 2,393 variables and 8,002 cases. Of these cases, 7,046 are continuing youth having completed a prior Youth Interview and 956 are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 8111, 8121, 8122, 8211, 8221, 8231, 8232, 8711, 8721, 8722, 8731, and 8732 (DS8111, DS8121, DS8122, DS8211, DS8221, DS8231, DS8232, DS8711, DS8721, DS8722, DS8731, and DS8732) are data files comprising the weight variables for Wave 8. In Wave 8, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and 7. Note that only adults have "all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and 7.
There are three separate sets of files with "single-wave" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 8, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 8, regardless of their participation in the intervening waves. Note that only adults have "single-wave" weights for the Wave 1 and Wave 4 Cohorts; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8 and youth from the Wave 4 Cohort were selected as shadow youth so they do not have any interview data from Wave 4. The "single-wave" weight files for the Wave 7 Cohort contain weights for participants who completed an interview in Wave 7 and in Wave 8.
There are also three separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, 7 and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. Note that only adults have "special collection all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, 7, and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. The "special collection all-waves" weight files for the Wave 7 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 7 and the special collection in Wave 7.5.
Dataset 8401 (DS8401) contains the Wave 8 State Identifier data for Adults and has 5 variables and 31,477 cases. Dataset 8402 (DS8402) contains the Wave 8 State Identifier data for Youth and Parents and has 5 variables and 8,002 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 8.
Dataset 8801 (DS8801) contains Location Characteristics for Wave 8 Adults. This data file contains 4 variables and 31,477 cases.
Dataset 8802 (DS8802) contains Location Characteristics for Wave 8 Youth. This data file contains 4 variables and 8,002 cases.
Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. In all waves of data collection, less than 0.5 percent of the parents did not complete an interview. Most questions are asked about the child.
When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) for the other child(ren) and the responses duplicated in that child(ren)'s data file(s).
Population Assessment of Tobacco and Health (PATH) Study [United States] Biomarker Restricted-Use Files (ICPSR 36840)
The Population Assessment of Tobacco and Health (PATH) Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study was launched in 2011 to inform the FDA's tobacco regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). For Wave 1 (baseline), the PATH Study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco, yielding interviews with 45,971 adult and youth respondents.
45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled PSUs and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
Biospecimen Collection
Each adult respondent, who completed the interview at Wave 1, was asked to provide at least two biospecimens. Providing biospecimens was voluntary and was not a condition of participation. Respondents were asked to report their use of all nicotine-containing products during the 3-day period prior to the time of any biospecimen collection (Nicotine Exposure Questions (NEQs)) to facilitate interpretation of biomarker results.
Of the 32,320 respondents who completed the Adult Interview at Wave 1, 21,801 (67.4 percent) provided a urine specimen and 14,520 (44.9 percent) provided a blood specimen. For the purposes of subsampling adults into the Wave 1 Biomarker Core, adult participants were grouped by tobacco product use at Wave 1 into nine mutually exclusive groups.
A sample of 11,522 adults who provided sufficient urine for the planned analyses were selected from the first six tobacco product use groups (see section 3.1 of the Biomarker Restricted-Use Files User Guide) representing people who never used tobacco, currently use tobacco, and formerly used tobacco (within the last 12 months). This group constitutes the original Wave 1 Biomarker Core. Of the 11,522 adults, 7,159 also provided a blood specimen. All urine and blood specimens provided by the Wave 1 Biomarker Core were sent for laboratory analysis.
Subsequent to this selection, an additional stratified probability sample of adults who completed the Wave 1 Adult Interview and provided a sufficient amount of urine for the planned analyses at Wave 1 (independent of whether they provided a blood specimen) was selected from the remaining three product use groups (see section 3.1 of the Biomarker Restricted-Use Files User Guide). Wave 1 blood and urine specimens from this expansion sample were also sent for laboratory analysis. The original and expansion samples together form the expanded Wave 1 Biomarker Core. The expansion sample did not provide urine specimens for laboratory analysis again until Wave 7.
Each youth who completed the Wave 4 interview was asked to provide a urine specimen. Each Wave 4 shadow youth (ages 10 and 11 at Wave 4) who completed the Wave 5 youth interview was also asked to provide a urine specimen. Providing this urine biospecimen was voluntary and was not a condition of participation.
Of the 14,798 respondents who completed the Youth Interview at Wave 4, 13,097 (88.5 percent) provided a urine specimen. A sample of 3,509 Wave 4 Cohort youth ages 12 to 17 who completed the Wave 4 Youth Interview and provided a sufficient amount of urine for the planned laboratory analyses was selected from a diverse mix of five tobacco product use and non-use groups. In addition, a sample of 528 Wave 4 shadow youth who completed a Wave 5 interview and provided a sufficient amount of urine for the planned laboratory analyses at Wave 5 was also selected. These 4,037 sampled youth and shadow youth constitute the Wave 4 Biomarker Core. All urine specimens provided by the Wave 4 Biomarker Core were sent for laboratory analysis.
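The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the Wave 4 Biomarker Core description.
youth_interviews = 14_798      # completed the Wave 4 Youth Interview
urine_providers = 13_097       # of those, provided a urine specimen
sampled_youth = 3_509          # Wave 4 Cohort youth ages 12 to 17 selected
sampled_shadow_youth = 528     # Wave 4 shadow youth selected at Wave 5

# Share of Wave 4 Youth Interview respondents who provided urine.
urine_rate = round(100 * urine_providers / youth_interviews, 1)

# Size of the Wave 4 Biomarker Core.
wave4_core = sampled_youth + sampled_shadow_youth

print(urine_rate, wave4_core)  # prints: 88.5 4037
```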
As members of the Wave 1 and Wave 4 Biomarker Cores age over time, a new Wave 7 Biomarker Core was designed to provide nationally representative estimates for the U.S. civilian noninstitutionalized adult (ages 18 and older) population (CNP) at the time of Wave 7 (2022-2023). To that end, at the conclusion of Wave 7, a new biomarker core was selected from Wave 7 Cohort adults who completed an interview and provided a urine specimen at Wave 7. The Wave 7 Biomarker Core sample selection was a two-stage process. Prior to the start of data collection, a subsample of continuing participants expected to be adults at the time of their Wave 7 interview, including some participants who were part of the Wave 1 or Wave 4 Biomarker Cores, was selected and flagged for urine collection; additionally, a subsample of replenishment sample addresses was selected and flagged so that any Wave 7 Adult Interview respondents living at the selected addresses would be asked to provide a urine specimen. Of the 10,698 Adult Interview respondents from these subsamples, 9,187 (85.9 percent) provided a urine specimen. A sample of 7,750 Wave 7 Cohort adults who completed the Wave 7 Adult Interview and provided a sufficient amount of urine for the planned laboratory analyses was selected from six mutually exclusive and exhaustive tobacco use groups (see section 3.3 of the Biomarker Restricted-Use Files User Guide). All urine specimens provided by the Wave 7 Biomarker Core were sent for laboratory analysis.
Biomarker Restricted-Use Files
The Wave 1 Restricted-Use Biomarker Data Files (Biomarker RUF) consist of three different types of files for the Wave 1 Biomarker Core:
- 2 Collection and NEQ files for Urine (DS1001) and Blood (DS1101)
- 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS1021) and Blood (DS1121). Both files are updated to include records for the expanded Wave 1 Biomarker Core.
- 8 Urine Panels (DS1031 to DS1038), 4 Serum Panels (DS1131 to DS1134) and 1 Plasma Panel (DS1231) containing biomarker assay results. 6 Urine Panels (DS1032, DS1033, DS1035, DS1036, DS1037, and DS1038) and 2 Serum Panels (DS1131 and DS1132) are updated to include records for the expanded Wave 1 Biomarker Core.
All files updated to include records for the expanded Wave 1 Biomarker Core contain an indicator R01_A_W1BC_TYPE (1 = Original, 2 = Expansion) to identify respondents in the Wave 1 Biomarker Core original and expansion subsamples.
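As an illustration of how the R01_A_W1BC_TYPE indicator might be used, the sketch below tallies and filters records by subsample. The indicator name and its codes (1 = Original, 2 = Expansion) come from the description above; the records themselves, the PERSONID values, and the assumption that each record is a plain dictionary are hypothetical.

```python
from collections import Counter

# Codes documented for R01_A_W1BC_TYPE.
LABELS = {1: "Original", 2: "Expansion"}

# Hypothetical records standing in for rows of an updated Wave 1 file.
records = [
    {"PERSONID": "P0001", "R01_A_W1BC_TYPE": 1},
    {"PERSONID": "P0002", "R01_A_W1BC_TYPE": 2},
    {"PERSONID": "P0003", "R01_A_W1BC_TYPE": 1},
]

# Count respondents in each Wave 1 Biomarker Core subsample.
counts = Counter(LABELS[r["R01_A_W1BC_TYPE"]] for r in records)

# Keep only the original subsample.
original_only = [r for r in records if r["R01_A_W1BC_TYPE"] == 1]

print(dict(counts))
print([r["PERSONID"] for r in original_only])
```

Restricting to the original subsample can be useful when following respondents across Waves 2 through 5, since urine biospecimens in those waves were requested only from the original Wave 1 Biomarker Core.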
For Wave 2, urine biospecimens were requested from the original Wave 1 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.
The Wave 2 Biomarker RUF consists of three different types of files:
- 1 Collection and NEQ file for Urine (DS2001)
- 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS2021) and F2PG2a (DS2022)
- 8 Urine Panels (DS2031 to DS2038) containing biomarker assay results.
For Wave 3, urine biospecimens were requested from the original Wave 1 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.
The Wave 3 Biomarker RUF consists of three different types of files:
- 1 Collection and NEQ file for Urine (DS3001)
- 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS3021 and DS3022) and F2PG2a (DS3023 and DS3024).
- 7 Urine Panels (DS3032 to DS3038) containing biomarker assay results.
For Wave 4, urine biospecimens were requested from the original Wave 1 Biomarker Core and all youth who completed the Wave 4 interview. Respondents were also asked to complete the NEQs prior to biospecimen collection.
The Wave 4 Biomarker RUF consists of the following files for each Biomarker Core:
Wave 1 Biomarker Core:
- 1 Collection and NEQ file for Urine (DS4001)
- 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS4021 and DS4022) and F2PG2a (DS4023 and DS4024).
- 7 Urine Panels (DS4032, DS4033, DS4034, DS4035, DS4036, DS4037 and DS4038) containing biomarker assay results.
Wave 4 Biomarker Core:
- 1 Collection and NEQ file for Youth Urine (DS4011)
- 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS4043)
- 7 Urine Panels (DS4051, DS4053, DS4054, DS4055, DS4056, DS4057 and DS4058) containing biomarker assay results.
For Wave 5, urine biospecimens were requested from the original Wave 1 Biomarker Core and the Wave 4 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.
The Wave 5 Biomarker RUF consists of the following files for each Biomarker Core:
Wave 1 Biomarker Core:
- 1 Collection and NEQ file for Urine (DS5001)
- 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS5021 and DS5022) and F2PG2a (DS5023 and DS5024)
- 6 Urine Panels (DS5032, DS5033, DS5035, DS5036, DS5037, and DS5038) containing biomarker assay results.
Wave 4 Biomarker Core:
- 1 Collection and NEQ file for Youth Urine (DS5011)
- 1 Collection and NEQ file for Adult Urine (DS5001)
- 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS5042)
- 7 Urine Panels (DS5051, DS5053, DS5054, DS5055, DS5056, DS5057, and DS5058) containing biomarker assay results.
Note that the initial release of 3 Urine Panels and Biomarker weights for the Wave 4 Biomarker Core only included records for those among the 3,509 members who responded in Wave 5 and provided urine specimens in sufficient quantities for laboratory analyses. As of version 20, the Wave 5 biomarker data files and weights include data for all Wave 4 Biomarker Core members who provided urine specimens at Wave 5 in sufficient quantities for laboratory analyses, including the Wave 4 shadow youth who completed their first interviews at Wave 5. This means that records were added to previously released urine panel data files (DS5051, DS5053, and DS5056) and biomarker weights (DS5042) to include data for the Wave 4 shadow youth (N=528) who completed their first interviews at Wave 5. All panels released in version 20 and beyond will include records for the complete Wave 4 Biomarker Core.
Also note that the Collection and NEQ file for Adult Urine (DS5001) includes data for both the Wave 1 Biomarker Core and Wave 4 Biomarker Core.
For Wave 7, urine biospecimens were requested from the Wave 1 Biomarker Core, the Wave 4 Biomarker Core, and those in the subsample eligible for the Wave 7 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.
The Wave 7 Biomarker RUF consists of the following files for each Biomarker Core:
Wave 1 Biomarker Core:
- 1 Collection and NEQ file for Urine (DS7001)
- 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS7021 and DS7022) and F2PG2a (DS7023 and DS7024)
- 6 Urine Panels (DS7032, DS7033, DS7035, DS7036, DS7037, and DS7038) containing biomarker assay results.
Wave 4 Biomarker Core:
- 1 Collection and NEQ file for Youth Urine (DS7011)
- 1 Collection and NEQ file for Adult Urine (DS7001)
- 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS7041 and DS7042)
- 6 Urine Panels (DS7051, DS7053, DS7055, DS7056, DS7057, and DS7058) containing biomarker assay results.
Wave 7 Biomarker Core:
- 1 Collection and NEQ file for Urine (DS7001)
- 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS7061)
- 4 Urine Panels (DS7072, DS7073, DS7076, DS7077) containing biomarker assay results.
The Collection and NEQ file for Adult Urine (DS7001) includes data for the Wave 1 Biomarker Core, Wave 4 Biomarker Core, and Wave 7 Biomarker Core.
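For readers who script their downloads, the Wave 7 listing above can be written down as a small lookup table. This is only an illustrative sketch: the dataset numbers are transcribed from the listing, while the group labels (`collection_neq`, `weights`, `urine_panels`) and the helper name are hypothetical and not part of the PATH Study documentation.

```python
# Wave 7 Biomarker RUF files by Biomarker Core, transcribed from the listing above.
# The keys "collection_neq", "weights", and "urine_panels" are illustrative labels only.
WAVE7_BIOMARKER_RUF = {
    "Wave 1 Biomarker Core": {
        "collection_neq": ["DS7001"],
        "weights": ["DS7021", "DS7022", "DS7023", "DS7024"],
        "urine_panels": ["DS7032", "DS7033", "DS7035", "DS7036", "DS7037", "DS7038"],
    },
    "Wave 4 Biomarker Core": {
        "collection_neq": ["DS7011", "DS7001"],
        "weights": ["DS7041", "DS7042"],
        "urine_panels": ["DS7051", "DS7053", "DS7055", "DS7056", "DS7057", "DS7058"],
    },
    "Wave 7 Biomarker Core": {
        "collection_neq": ["DS7001"],
        "weights": ["DS7061"],
        "urine_panels": ["DS7072", "DS7073", "DS7076", "DS7077"],
    },
}


def cores_using(dataset_id):
    """Return the Biomarker Cores whose file list includes dataset_id."""
    return [
        core
        for core, groups in WAVE7_BIOMARKER_RUF.items()
        if any(dataset_id in ids for ids in groups.values())
    ]


# DS7001 is shared by all three cores; DS7011 belongs to the Wave 4 core only.
print(cores_using("DS7001"))
print(cores_using("DS7011"))
```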
Please refer to the Biomarker Restricted-Use Files User Guide for additional information about the Biomarker Cores.
References to the collection of biospecimens will be specified by the collected specimen, i.e., urine and (whole) blood. However, references to biomarker analyses and analytes will be specified by the type of matrix (serum, plasma, or urine) used for the analysis.
Population Assessment of Tobacco and Health (PATH) Study [United States] Public-Use Files (ICPSR 36498)
The Population Assessment of Tobacco and Health (PATH) Study originally began by surveying 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.
45,971 adults and youth constitute the first (baseline) wave of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.
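The aging-up rules described above can be sketched as a small function. This is a simplified illustration, not the study's actual eligibility logic: the function name and return labels are hypothetical, and anyone under 12 is treated as a shadow youth even though the shadow-youth age range is narrower.

```python
def interview_type(age, parental_consent=True):
    """Map a participant's age at the current wave to the interview they are invited to.

    Simplification (assumption): every participant under 12 is treated as a
    shadow youth; the study's actual shadow-youth age range is narrower.
    """
    if age >= 18:
        # Youth who have turned 18 are "aged-up adults".
        return "Adult Interview"
    if age >= 12:
        # Shadow youth who have turned 12 are "aged-up youth"; parental consent is needed.
        return "Youth Interview" if parental_consent else "no interview (no parental consent)"
    return "shadow youth (no interview yet)"


for age in (11, 12, 17, 18):
    print(age, interview_type(age))
```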
At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 14 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave.
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.
Please refer to the Public-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.
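As a quick consistency check on the sample sizes quoted above, the arithmetic can be reproduced in a few lines. The figures are taken from the text; the variable names are illustrative. The second sum matches the 67,276 cases reported for the Master Linkage file (DS0001), which appears to correspond to the Wave 1 Cohort plus the Wave 4 replenishment sample.

```python
# Figures quoted in the study description; variable names are illustrative only.
WAVE1_RESPONDENTS = 45_971    # adults and youth interviewed at Wave 1
WAVE1_SHADOW_YOUTH = 7_207    # shadow youth sampled at Wave 1
WAVE4_REPLENISHMENT = 14_098  # replenishment sample selected at Wave 4

wave1_cohort = WAVE1_RESPONDENTS + WAVE1_SHADOW_YOUTH
participants_through_wave4 = wave1_cohort + WAVE4_REPLENISHMENT

print(wave1_cohort)                # 53178, the Wave 1 Cohort
print(participants_through_wave4)  # 67276, the case count quoted for DS0001
```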
Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview.
Dataset 1002 (DS1002) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,228 variables and 13,651 cases.
Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.
Dataset 2002 (DS2002) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."
Dataset 3001 (DS3001) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,139 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 3002 (DS3002) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,309 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 3101, 3102, 3201 and 3202 (DS3101, DS3102, DS3201, and DS3202) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, in Wave 3, the weight variables have been separated into individual data files for Adult and Youth Questionnaires. The "all-waves" weight files contain weights for those respondents who have completed an interview during all three waves of data collection. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.
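Because the Wave 3 weights live in separate files, they have to be joined back onto the questionnaire data before weighted analysis. The sketch below shows one way to do that with plain Python dictionaries. It assumes both files share a person-level identifier column, here called `PERSONID`; that column name, the weight column name, and the toy values are assumptions for illustration, so check the codebooks for the actual names.

```python
def attach_weights(data_rows, weight_rows, key="PERSONID"):
    """Left-join weight columns onto questionnaire rows using a shared ID column.

    Rows without a match in the weight file are kept unchanged, which is the
    expected outcome when joining an "all-waves" weight file to a single wave
    of data: only respondents who completed every wave have such a weight.
    """
    weights_by_id = {row[key]: row for row in weight_rows}
    merged = []
    for row in data_rows:
        extra = weights_by_id.get(row[key], {})
        merged.append({**row, **{k: v for k, v in extra.items() if k != key}})
    return merged


# Toy rows with hypothetical column names and values.
adult_rows = [
    {"PERSONID": "P001", "current_smoker": 1},
    {"PERSONID": "P002", "current_smoker": 0},
]
all_waves_weights = [{"PERSONID": "P001", "all_waves_weight": 1523.4}]

for row in attach_weights(adult_rows, all_waves_weights):
    print(row)
```

A left join is used so that no questionnaire record is silently dropped; records without a weight can then be excluded explicitly at the analysis stage.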
Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,182 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).
Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,389 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
Datasets 4111, 4112, 4211, 4212, 4321, and 4322 (DS4111, DS4112, DS4211, DS4212, DS4321, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.
Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,315 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,530 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.
Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.
Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,589 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.
Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 1,822 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth interview and 30 cases are "aged-up youth" having previously been sampled as "shadow youth."
Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.
Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 2,813 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).
Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 1,897 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
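The case counts quoted for the Adult and Youth data files in Waves 2 through 7 can be cross-checked against their stated breakdowns (continuing, aged-up, and replenishment respondents). The sketch below transcribes the figures from the dataset descriptions; the dictionary layout and function name are illustrative only.

```python
# (total cases, [subgroup counts]) as quoted in the dataset descriptions.
DATA_FILES = {
    "DS2001": (28_362, [26_447, 1_915]),
    "DS2002": (12_172, [10_081, 2_091]),
    "DS3001": (28_148, [26_241, 1_907]),
    "DS3002": (11_814, [9_769, 2_045]),
    "DS4001": (33_822, [25_857, 1_900, 6_065]),
    "DS4002": (14_798, [9_365, 1_694, 3_739]),
    "DS5001": (34_309, [29_876, 4_433]),
    "DS5002": (12_098, [10_446, 1_652]),
    "DS6001": (30_516, [28_852, 1_664]),
    "DS6002": (5_652, [5_622, 30]),
    "DS7001": (30_801, [27_258, 1_740, 1_803]),
    "DS7002": (10_834, [3_512, 1, 7_321]),
}


def count_mismatches(files):
    """Return {dataset: total minus sum of subgroups} for files that do not add up."""
    return {
        ds: total - sum(parts)
        for ds, (total, parts) in files.items()
        if total != sum(parts)
    }


# An empty dict means every stated breakdown sums to its file's case count.
print(count_mismatches(DATA_FILES))
```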
Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.
There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.
There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.
There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.
The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.
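The Wave 7 weight definitions for the Wave 1 and Wave 4 Cohorts can be paraphrased as a small decision function. This is a rough sketch under stated assumptions, not the study's weighting procedure: the function name, cohort labels, and wave labels (strings such as "4.5" and "ATS") are hypothetical, the special-collection condition is read as "Wave 4.5 and (Wave 5.5 or PATH-ATS)", and the "cross-sectional" weights are left out because Wave 7 Cohort membership depends on sampling details not modeled here.

```python
COHORT_START = {"W1": 1, "W4": 4}  # first wave of each cohort (labels are illustrative)


def wave7_weight_types(cohort, interviewed, verified=frozenset()):
    """Return the Wave 7 weight types a participant would qualify for.

    interviewed: wave labels with a completed interview, e.g. {"1", "2", "7"}
    verified:    wave labels where information was verified because the
                 participant was too young to be interviewed
    """
    types = set()
    if "7" not in interviewed:
        return types  # every weight type described requires a Wave 7 interview

    start = COHORT_START[cohort]
    took_part = set(interviewed) | set(verified)

    # Single-wave: interview in the cohort's first wave and in Wave 7.
    if str(start) in interviewed:
        types.add("single-wave")

    # All-waves: interview or verification in every wave from the first through Wave 6.
    if all(str(w) in took_part for w in range(start, 7)):
        types.add("all-waves")
        # Assumed reading: Wave 4.5 and (Wave 5.5 or PATH-ATS).
        if "4.5" in took_part and ("5.5" in took_part or "ATS" in took_part):
            types.add("special collection all-waves")
    return types


print(wave7_weight_types("W1", {"1", "7"}))
print(wave7_weight_types("W4", {"4", "5", "6", "7"}))
```

Keeping interviews and verifications in separate sets matters because the "single-wave" definition mentions completed interviews only, while the "all-waves" definitions also accept verified information.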
Dataset 6503 (DS6503) contains data derived from responses to Wave 1-6 (including Wave 4.5, Wave 5.5, and PATH-ATS) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.
Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. Across all waves of data collection, an average of 0.6 percent of the parents did not complete an interview. Most questions are asked about the child.
When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) for the other child(ren), and the responses were duplicated in the data file(s) for the other child(ren).
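The duplication of a parent's self-reported answers across siblings can be pictured with a short sketch. The record layout, field names, and values below are hypothetical and serve only to illustrate the rule described above; they do not reflect the actual variable names in the Youth data files.

```python
def fill_parent_answers(interviews):
    """Copy the parent's one-time self-report into every youth record.

    interviews: parent interviews from one household, in the order completed.
    Each has "youth_id", "about_child" (dict), and "about_parent" (dict, or
    None when those questions were skipped in a later interview).
    """
    parent_answers = next(
        (i["about_parent"] for i in interviews if i["about_parent"] is not None),
        {},
    )
    return [
        {"youth_id": i["youth_id"], **i["about_child"], **parent_answers}
        for i in interviews
    ]


# Toy household: one parent, two sampled youth (all names and values are made up).
household = [
    {"youth_id": "Y1", "about_child": {"child_curfew": 1}, "about_parent": {"parent_education": 3}},
    {"youth_id": "Y2", "about_child": {"child_curfew": 0}, "about_parent": None},
]

for record in fill_parent_answers(household):
    print(record)
```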