Search results

Showing 1 – 28 of 28 results.
Curated

Comprehensive Investigation of the Role of Individuals, the Immediate Social Environment, and Neighborhoods in Trajectories of Adolescent Antisocial Behavior in Chicago, Illinois, 1994-2002 (ICPSR 33921)

Released/updated on: 2012-12-19
Geographic coverage: United States, Chicago, Illinois
Time period: 1994-01-01--2002-01-01
The overall goal of this study was to acquire a greater understanding of the development of adolescent antisocial behavior using data from the Project on Human Development in Chicago Neighborhoods (PHDCN). Longitudinal cohort data from PHDCN were analyzed to assess patterns of substance use and delinquency across three waves for three age cohorts and 78 neighborhoods. This analysis of existing PHDCN data used multiple cohort and multilevel latent growth models as well as several ancillary approaches to answer questions pertinent to the development of adolescent antisocial behavior.
Curated

Monitoring Drug Epidemics and the Markets That Sustain Them, Arrestee Drug Abuse Monitoring (ADAM) and ADAM II Data, 2000-2003 and 2007-2010 (ICPSR 33201)

Released/updated on: 2012-12-13
Geographic coverage: North Carolina, Oregon, District of Columbia, Charlotte, Sacramento, Indiana, United States, Chicago, Minnesota, California, New York (state), New York City, Minneapolis, Atlanta, Illinois, Colorado, Portland (Oregon), Denver, Georgia, Indianapolis
Time period: 2000-01-01--2003-01-01, 2007-01-01--2010-01-01
This study examined trends in the use of five widely abused drugs among arrestees at 10 geographically diverse locations from 2000 to 2010: Atlanta, Charlotte, Chicago, Denver, Indianapolis, Manhattan, Minneapolis, Portland, Oregon, Sacramento, and Washington, DC. The data came from the Arrestee Drug Abuse Monitoring Program reintroduced in 2007 (ADAM II) and its predecessor, the ADAM program. ADAM data included urinalysis results that provided an objective measure of recent drug use, provided location-specific estimates over time, and provided sample weights that yielded unbiased estimates for each location. The ADAM data were analyzed according to a drug epidemics framework, which has been previously employed to understand the decline of the crack epidemic, the growth of marijuana use in the 1990s, and the persistence of heroin use. Similar to other diffusion of innovation processes, drug epidemics tend to follow a natural course passing through four distinct phases: incubation, expansion, plateau, and decline. The study also searched for changes in drug markets over the course of a drug epidemic.
The following results may be significantly less relevant compared to results above.
Curated

Treatment Episode Data Set -- Admissions (TEDS-A) -- Concatenated, 1992 to 2012 (ICPSR 25221)

Released/updated on: 2015-11-23
Geographic coverage: United States
Time period: 1992-01-01--2012-01-01

The Treatment Episode Data Set -- Admissions (TEDS-A) is a national census data system of annual admissions to substance abuse treatment facilities. TEDS-A provides annual data on the number and characteristics of persons admitted to public and private substance abuse treatment programs that receive public funding. The unit of analysis is a treatment admission. TEDS consists of data reported to state substance abuse agencies by the treatment programs, which in turn report it to SAMHSA.

A sister data system, called the Treatment Episode Data Set -- Discharges (TEDS-D), collects data on discharges from substance abuse treatment facilities. The first year of TEDS-A data is 1992, while the first year of TEDS-D is 2006.

TEDS variables that are required to be reported are called the "Minimum Data Set (MDS)", while those that are optional are called the "Supplemental Data Set (SuDS)".

Variables in the MDS include: information on service setting, number of prior treatments, primary source of referral, gender, race, ethnicity, education, employment status, substance(s) abused, route of administration, frequency of use, age at first use, and whether methadone was prescribed in treatment. Supplemental variables include: diagnosis codes, presence of psychiatric problems, living arrangements, source of income, health insurance, expected source of payment, pregnancy and veteran status, marital status, detailed not in labor force codes, detailed criminal justice referral codes, days waiting to enter treatment, and the number of arrests in the 30 days prior to admissions (starting in 2008).

Substances abused include alcohol, cocaine and crack, marijuana and hashish, heroin, nonprescription methadone, other opiates and synthetics, PCP, other hallucinogens, methamphetamine, other amphetamines, other stimulants, benzodiazepines, other non-benzodiazepine tranquilizers, barbiturates, other non-barbiturate sedatives or hypnotics, inhalants, over-the-counter medications, and other substances.

Created variables include total number of substances reported, intravenous drug use (IDU), and flags for any mention of specific substances.

Curated

Treatment Episode Data Set -- Discharges (TEDS-D) -- Concatenated, 2006 to 2011 (ICPSR 30122)

Released/updated on: 2015-11-23
Geographic coverage: United States
Time period: 2006-01-01--2011-01-01

The Treatment Episode Data Set -- Discharges (TEDS-D) is a national census data system of annual discharges from substance abuse treatment facilities. TEDS-D provides annual data on the number and characteristics of persons discharged from public and private substance abuse treatment programs that receive public funding. Data collected both at admission and at discharge are included. The unit of analysis is a treatment discharge. TEDS-D consists of data reported to state substance abuse agencies by the treatment programs, which in turn report it to SAMHSA.

A sister data system, called the Treatment Episode Data Set -- Admissions (TEDS-A), collects data on admissions to substance abuse treatment facilities. The first year of TEDS-A data is 1992, while the first year of TEDS-D is 2006.

TEDS-D variables that are required to be reported are called the "Minimum Data Set (MDS)", while those that are optional are called the "Supplemental Data Set (SuDS)".

Variables unique to TEDS-D, and not part of TEDS-A, are the length of stay, reason for leaving treatment, and service setting at time of discharge. TEDS-D also provides many of the same variables that exist in TEDS-A. This includes information on service setting, number of prior treatments, primary source of referral, gender, race, ethnicity, education, employment status, substance(s) abused, route of administration, frequency of use, age at first use, and whether methadone was prescribed in treatment. Supplemental variables include: diagnosis codes, presence of psychiatric problems, living arrangements, source of income, health insurance, expected source of payment, pregnancy and veteran status, marital status, detailed not in labor force codes, detailed criminal justice referral codes, days waiting to enter treatment, and the number of arrests in the 30 days prior to admissions (starting in 2008).

Substances abused include alcohol, cocaine and crack, marijuana and hashish, heroin, nonprescription methadone, other opiates and synthetics, PCP, other hallucinogens, methamphetamine, other amphetamines, other stimulants, benzodiazepines, other non-benzodiazepine tranquilizers, barbiturates, other non-barbiturate sedatives or hypnotics, inhalants, over-the-counter medications, and other substances.

Created variables include total number of substances reported, intravenous drug use (IDU), and flags for any mention of specific substances.

Curated
Restricted

Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2000 (ICPSR 3270)

Released/updated on: 2006-03-30
Geographic coverage: United States
Time period: 2000-01-01--2000-12-31
Beginning in 1996, the National Institute of Justice (NIJ) initiated a major redesign of its multisite drug-monitoring program, the Drug Use Forecasting (DUF) system (DRUG USE FORECASTING IN 24 CITIES IN THE UNITED STATES, 1987-1997 [ICPSR 9477]). The program was retitled Arrestee Drug Abuse Monitoring (ADAM) (see ARRESTEE DRUG ABUSE MONITORING (ADAM) PROGRAM IN THE UNITED STATES, 1998 [ICPSR 2628] and 1999 [ICPSR 2994]). ADAM extended DUF in the number of sites and improved the quality and generalizability of the data. The redesign was fully implemented in all sites beginning in the first quarter of 2000. The ADAM program implemented a new and expanded adult instrument in the first quarter of 2000, which was used for both the male (Part 1) and female (Part 2) data. The juvenile data for 2000 (Part 3) used the juvenile instrument from previous years. The ADAM program also moved to probability-based sampling for the adult male population during 2000. Therefore, the 2000 adult male sample includes weights, generated through post-sampling stratification of the data. The shift to sampling of the adult male population in 2000 required that all 35 sites move to a common catchment area, the county. The core instrument for the adult cases was supplemented by a facesheet, which was used to collect demographic and charge information from official records. Core instruments were used to collect self-report information from the respondent. Both the adult and juvenile instruments were administered to persons arrested and booked on local or state charges relevant to the jurisdiction (i.e., not federal or out-of-county charges) within the past 48 hours. At the completion of the interview the arrestee was asked to voluntarily provide a urine specimen. An external lab used the Enzyme Multiplied Immunoassay Testing (EMIT) protocols to test for the presence of ten drugs or metabolites of the drug in the urine sample. 
All amphetamine positives were confirmed by gas chromatography/mass spectrometry (GC/MS) to determine whether methamphetamine was used. For the adult data, variables from the facesheet include arrest precinct, ZIP code of arrest location, ZIP code of respondent's address, respondent's gender and race, three most serious arrest charges, sample source (stock, flow, other), interview status (including reason the individual selected in the sample was not interviewed), language of instrument used, and the number of hours since arrest. Demographic information from the core instrument includes respondent's age, ethnicity, residency, education, employment, health insurance coverage, marital status, housing, and telephone access. Variables from the calendar provide information on inpatient and outpatient substance abuse treatment, inpatient mental health treatment, arrests and incarcerations, heavy alcohol use, use of marijuana, crack/rock cocaine, powder cocaine, heroin, methamphetamine, and other drug (ever and previous 12 months), age of first use of the above six drugs and heavy alcohol use, drug dependency in the previous 12 months, characteristics of drug transactions in past 30 days, use of marijuana, crack/rock cocaine, powder cocaine, heroin, and methamphetamine in past 30 days, 7 days, and 48 hours, heavy alcohol use in past 30 days, and secondary drug use of 15 other drugs in the past 48 hours. Urine test results are provided for 11 drugs -- marijuana, cocaine, opiates, phencyclidine (PCP), benzodiazepines (Valium), propoxyphene (Darvon), methadone, methaqualone, barbiturates, amphetamines, and methamphetamine. The adult data files include several derived variables. The male data also include four sampling weights, and stratum identifications and percents. For the juvenile data, demographic variables include age, race, sex, educational attainment, employment status, and living circumstances. 
Data also include each juvenile arrestee's self-reported use of 15 drugs (alcohol, tobacco, marijuana, powder cocaine, crack, heroin, PCP, amphetamines, barbiturates, quaaludes, methadone, crystal methamphetamine, Valium, LSD, and inhalants). For each drug type, arrestees reported whether they had ever used the drug, age of first use, whether they had used the drug in the past 30 days and past 72 hours, number of days they used the drug in past month, whether they tried to cut down or quit using the drug, if they were successful, whether they felt dependent on the drug, whether they were receiving treatment for the drug, whether they had received treatment for the drug in the past, and whether they thought they could use treatment for that drug. Additional variables include whether juvenile respondents had ever injected drugs, whether they were influenced by drugs when they allegedly committed the crime for which they were arrested, whether they had been to an emergency room for drug-related incidents, and if so, whether in the past 12 months, and information on arrests and charges in the past 12 months. As with the adult data, urine test results are also provided. Finally, variables covering precinct (precinct of arrest) and law (penal law code associated with the crime for which the juvenile was arrested) are also provided for use by local law enforcement officials at each site.
Curated
Restricted

Process and Outcome Evaluation of the Residential Substance Abuse Treatment (RSAT) Program in Kyle, Texas, 1993-1995 (ICPSR 2765)

Released/updated on: 2006-03-30
Geographic coverage: United States, Texas
Time period: 1993-01-01--1995-01-01
This study was undertaken to evaluate the treatment process and outcomes associated with a Residential Substance Abuse Treatment (RSAT) In-Prison Therapeutic Community (ITC) component of the 1991 Texas Criminal Justice Chemical Dependency Treatment Initiative, as well as to assess the effectiveness of prison-based drug treatment. Specifically, this study evaluated the RSAT ITC treatment process and outcomes in Kyle, Texas, using the prison-based treatment assessment (PTA) data systems. The study design included process and outcome evaluations using a sample of graduates from the first ITC treatment facility (Kyle cohort) and a matched comparison group of prison inmates who were eligible, but not selected, for assignment to an ITC. Data collection occurred at three points in time -- at the end of treatment in the Kyle ITC, and at six months and one year following an offender's release from the ITC program. Variables in the 19 files for this study include: Part 1 (Educational Demographic Data, Kyle Cohort): Highest grade level achieved by respondent, Texas Department of Criminal Justice education achievement and IQ scores, and the number of days at the Kyle ITC program. Parts 2-4 (Treatment Background Data, Kyle Cohort, Aftercare Treatment Data, Kyle Cohort, Treatment Condition Data, Kyle Cohort): Treatment condition, discharge codes, and whether there were three months of residential aftercare. Part 5 (Session One Interview Data, Kyle Cohort): Gender, ethnicity, age, marital status, whether the respondent was given medication, followed directions, made friends, or got into trouble while in elementary school, whether he held a job prior to prison, if either of his parents spent time with, yelled at, or sexually abused him, whether he used drugs, if so, specific drugs used (e.g., alcohol, inhalants, marijuana, or crack), and whether he did jail time. 
Part 6 (Session Two Interview Data, Kyle Cohort): Whether drugs kept the respondent from working, caused emotional problems, or caused medical problems, if people were important to the respondent, if he had trouble staying focused, felt sad or depressed, satisfied with life, lonely, nervous, or got mad easily, whether he felt the staff was caring and helpful, whether he showed concern for the group and accepted confrontation by the group, whether the respondent felt the counselor was easy to talk to, respected him, or taught him problem-solving, and whether the respondent viewed himself as thinking clearly, clearly expressing thoughts, and was interested in treatment. Part 7 (Session Three Interview Data, Kyle Cohort): How the respondent saw himself as a child, whether he was easily distracted, anxious, nervous, inattentive, short-tempered, stubborn, depressed, rebellious, irritable, moody, angry, or impulsive, whether the respondent had trouble with school, was considered normal by friends, ever lost a job or friends due to drinking or drug abuse, or was ever arrested or hospitalized for drug or alcohol abuse, and in the last week whether the respondent's mood was one of sadness, satisfaction, disappointment, irritation, or suicide. 
Parts 8 and 9 (Six-Month Follow-Up Interview Data, Kyle Cohort, and One-Year Follow-Up Interview Data, Kyle Cohort): Organization of meetings and activities in the program, rules and regulations, work assignments, privileges, individual counseling, the care and helpfulness of the treatment staff and custody staff, the respondent's behavior, mood, living situation, drug use, and arrests within the last six months, whether the counselor was easy to talk to, helped in motivating or building confidence, or assisted in making a treatment plan, whether the respondent felt a sense of family or closeness, if his family got along, enjoyed being together, got drunk together, used drugs together, or had arguments or fights, if the respondent had a job in the last six months to a year and if he enjoyed working, whether he was on time for his job, whether he had new friends or associated with old friends, and which specific drugs he had used in the last six months (e.g., hallucinogens, heroin, methadone, or other opiates). Part 10 (Treatment Background Data, Comparison Group): Treatment condition of the comparison group. Part 11 (Educational Demographic Data, Comparison Group): Whether respondents completed a GED and their highest grade completed. 
Parts 12 and 13 (Six-Month Follow-Up Interview Data, Comparison Group, and One-Year Follow-Up Interview Data, Comparison Group): How important church was to the respondent, whether the respondent had any educational or vocational training, if he had friends that had used drugs, got drunk, dealt drugs, or had been arrested, if within the last six months to a year the respondent had been arrested for drug use, drug sales, forgery, fencing, gambling, burglary, robbery, sexual offense, arson, or vandalism, whether drugs or alcohol affected the respondent's health, relations, attitude, attention, or ability to work, whether the respondent experienced symptoms of withdrawal, the number of drug treatment programs and AA or CA meetings the respondent attended, whether the respondent received help from parents, siblings, or other relatives, if treatment was considered helpful, and risky behavior engaged in (e.g., sharing needles, using dirty needles, and unprotected sex). Parts 14 and 16 (Probation Officer Data, Six-Month Follow-Up Interview, Kyle Cohort and Comparison Group, and Probation Officer Data, One-Year Follow-Up Interview, Kyle Cohort and Comparison Group): Date of departure from prison, supervision level, number of treatment team meetings, whether there was evidence of job hunting, problems with transportation, child care, or finding work, number of drug tests in the last six months, times tested positive for marijuana, cocaine, heroin, opiates, crack, or other drugs, and number of arrests, charges, convictions, and technicals. Parts 15 and 17 (Hair Specimen Data, Six-Month Follow-Up Interview, Kyle Cohort and Comparison Group, and Hair Specimen Data, One-Year Follow-Up Interview, Kyle Cohort and Comparison Group): Hair collection and its source at the six-month follow-up (Part 15) and one-year follow-up (Part 17) and whether parolee was positive or negative for cocaine or opiates. 
Part 18 (Texas Department of Public Safety Data, Kyle Cohort and Comparison Group): Dates of first, second, and third offenses, if parolee was arrested, and first, second, and third offenses from the National Crime Information Center. Part 19 (Texas Department of Criminal Justice Data, Kyle Cohort and Comparison Group): Treatment condition, date of release, race, and a Texas Department of Criminal Justice Salient Factor Risk Score.
Curated
Restricted

Characteristics of Arrestees at Risk for Co-Existing Substance Abuse and Mental Disorder in Cleveland, Ohio, 2003 (ICPSR 20352)

Released/updated on: 2009-02-25
Geographic coverage: United States, Ohio, Cleveland
Time period: 2003-04-01--2003-06-01
The current study was conducted as a supplemental study to the Cleveland/Cuyahoga County Arrestee Drug Abuse Monitoring (ADAM) program in the second quarter of 2003 (April-June). A risk screening instrument was implemented to classify Cleveland/Cuyahoga County adult arrestees into four groups: arrestees at no risk for substance abuse or dependence or mental disorder; arrestees at risk for substance abuse or dependence with no risk for mental disorder; arrestees at risk for mental disorder with no risk for substance abuse or dependence; and arrestees at risk for both mental disorder and substance abuse or dependence. A total of 311 adult arrestees were interviewed and provided a urine sample submitted for testing. The dual risk screening instrument includes six mental disorder risk questions and six substance abuse risk questions. The mental disorder risk questions include questions on having feelings or emotions that make it difficult to complete normal day to day activities, feeling hopeless or depressed, having thoughts of hurting oneself or committing suicide, and hearing or seeing things that others cannot hear or see. The substance abuse risk questions include questions on problems caused by drinking or drug use, arrests due to alcohol or drug use, time spent on thinking about or trying to get alcohol or drugs, and feelings of guilt about drinking or drug use.
Curated

National Drug Abuse Treatment System Survey, Waves II-IV (ICPSR 4146)

Released/updated on: 2009-07-30
Geographic coverage: United States
The National Drug Abuse Treatment System Survey (NDATSS) is a longitudinal program of research into organizational structures, operating characteristics, and treatment modalities of outpatient drug treatment programs in the United States. This is done through interviews with program directors and clinical supervisors. In some publications, this research is referred to as the Outpatient Drug Abuse Treatment Studies (ODATS). Data being released include Wave II (1988), Wave III (1990), and Wave IV (1995).
Curated
Restricted

Modeling Impacts of Policing Initiatives on Drug and Criminal Careers of Arrestees in New York City, New York, 1999 (ICPSR 3604)

Released/updated on: 2006-03-30
Geographic coverage: New York City, United States, New York (state)
This study sought to understand the accuracy and validity of arrestee self-reports of drug use and the overall contact of arrestees with the criminal justice system, with a secondary focus on how arrestee self-reports of drug use correspond to urinalysis results. Moreover, this study investigated whether arrestees were aware of the New York City Police Department's Quality-of-Life (QOL) policing efforts and whether they had changed their criminal behavior as a result. A QOL Policing Supplement, designed to explore new means of evaluating police behavior, was administered to all adult arrestees in the five boroughs of New York City (Bronx, Brooklyn, Manhattan, Staten Island, and Queens) who had completed an Arrestee Drug Abuse Monitoring (ADAM) program interview, provided a urine specimen, and were willing to answer additional questions concerning QOL policing. Part 1, Policing Study Data, is a large integrated dataset containing all of the variables derived from the 1999 ADAM interviews, the Policing Supplement instrument, and administrative records data from the Criminal Justice Agency (CJA) and the New York State Division of Criminal Justice Services. This dataset is linked, via an anonymous case number, to Part 2, Arrestee Criminal History Data, which contains each arrestee's official criminal history.
Curated
Restricted

Process Evaluation of a Residential Substance Abuse Treatment (RSAT) Program in Dallas County, Texas, 1998-1999 (ICPSR 3077)

Released/updated on: 2003-06-05
Geographic coverage: United States, Texas
Time period: 1998-01-01--1999-06-01
This study assessed the Dallas County Judicial Treatment Center (DCJTC) in Texas. The DCJTC is a residential substance abuse treatment center for drug-involved felony offenders. It provides a treatment program of approximately six months in three major phases: orientation, main treatment, and re-entry. Data were collected from 429 offenders admitted to the DCJTC between January and December 1998. During their first week of treatment, residents completed a comprehensive intake battery that included (1) the Texas Christian University (TCU) initial assessment, (2) the TCU self-rating form (SRF), and (3) the TCU intake interview. The initial assessment gauged mental status, background and psychosocial functioning, alcohol and other drug use, and psychological status. The SRF assessed psychological functioning, social functioning, and motivation for treatment. The intake interview included detailed questions on the resident's social background, family and peer relations, health and psychological status, criminal history, drug use problems, and behavioral risks for HIV/AIDS. Progress made during treatment was measured by the TCU Resident Evaluation of Self and Treatment (REST) and the TCU Counselor Rating of Client (CRC) forms. The REST included all questions on the SRF, plus questions on offenders' perceptions of the structure of the program and their experiences while in treatment, an evaluation of the counselor, an evaluation of their own personality, and ratings of group and individual treatment sessions. The CRC forms rated residents on a set of attributes related to residents' ability to benefit from treatment and indicated the extent to which counseling activities with each client had focused on certain activities.
Curated
Restricted

Multi-Site Adult Drug Court Evaluation (MADCE), 2003-2009 (ICPSR 30983)

Released/updated on: 2012-11-05
Geographic coverage: North Carolina, New York, United States, Illinois, Georgia, Florida, Washington, South Carolina, Pennsylvania
Time period: 2004-02-01--2004-06-01, 2005-03-01--2006-06-01, 2005-08-01--2006-12-01, 2006-09-01--2008-01-01

The Multi-Site Adult Drug Court Evaluation (MADCE) study included 23 drug courts and 6 comparison sites selected from 8 states across the country. The purpose of the study was to: (1) test whether drug courts reduce drug use, crime, and multiple other problems associated with drug abuse, in comparison with similar offenders not exposed to drug courts, (2) address how drug courts work and for whom by isolating key individual and program factors that make drug courts more or less effective in achieving their desired outcomes, (3) explain how offender attitudes and behaviors change when they are exposed to drug courts and how these changes help explain the effectiveness of drug court programs, and (4) examine whether drug courts generate cost savings.

Offenders in all 29 sites were surveyed in 3 waves, at baseline, 6 months later, and 18 months after enrollment. The research comprises three major components: process evaluation, impact evaluation, and a cost-benefit analysis. The process evaluation describes how the 23 drug court sites vary in program eligibility, supervision, treatment, team collaboration, and other key policies and practices. The impact evaluation examines whether drug courts produce better outcomes than comparison sites and tests which court policies and offender attitudes might explain those effects. The cost-benefit analysis evaluates drug court costs and benefits.

Curated
Restricted

Adoption of Innovations in Private Alcohol and Drug Treatment Centers in the United States [Restricted-Use], 2009-2013 (ICPSR 37621)

Released/updated on: 2020-08-12
Geographic coverage: United States
Time period: 2009-01-01--2013-01-01

Adoption of Innovations in Private Alcohol and Drug Treatment Centers is a multi-wave longitudinal study conducted between 2009 and 2013. The study goal was to measure the adoption and implementation of evidence-based treatment practices in treatment centers that received more than 50 percent of their total operational funding from sources that were not guaranteed from year to year. This definition is based on the concept of entrepreneurship, namely the necessity for the treatment organization to respond to changing conditions in the external political and economic environment in order to obtain half or more of its funding. The innovations considered are of three types usually specific to organizations treating substance use disorders:

  • medication-assisted treatments
  • psychosocial treatments
  • managerial practices

This data set consists of one of the multiple "waves" of data collection. The data were collected at four points in time. The baseline data, collected from June 2009 through October 2011 from 327 treatment centers, were obtained through face-to-face onsite interviews ranging from 1 to 4 hours in duration. These interviews were conducted with administrators of the respective treatment centers. In 70 of the 327 treatment centers, an administrator of the overall center and the administrator of clinical operations separately completed administrative and clinical interviews. In the remaining 257 centers, all of the administrative and clinical data were collected from the administrator of the overall center since there was no specialized administrator of clinical operations. The baseline data available here merge the data collected through these two different procedures so that the variables measured are identical for all centers regardless of the procedure.

The collected data include detailed information on Medication Assisted Treatment (MAT) and other treatment strategies used by the center to treat opioid use disorder (OUD) and alcohol use disorder (AUD). Where a center did not use medications, respondents were asked why the available medications were not used in treatment. Other sections of the interviews covered data on the organizations, their management, and other clinical practices implemented for OUD, AUD, and substance use disorder (SUD).

Three follow-up interviews were conducted via telephone at six-month intervals following the previous interview. These follow-up interviews were much shorter than the baseline interview. They centered on key changes in the center's operation and on the adoption of key innovations, while retaining a focus on medications provided for treatment.

Curated
Restricted

Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2002 (ICPSR 3815)

Released/updated on: 2006-03-30
Geographic coverage: North Carolina, Oklahoma City, Charlotte, Indiana, Tucson, Albuquerque, Spokane, Utah, San Jose, New York City, San Diego, Arizona, Las Vegas, Sacramento, Seattle, California, Washington, District of Columbia, Pennsylvania, Tulsa, Laredo, Iowa, Illinois, Texas, Portland (Oregon), Georgia, Indianapolis, Oregon, United States, Oklahoma, Rio Arriba, Alabama, Cleveland, Washington, Nebraska, Albany (New York), Omaha, Minneapolis, Woodbury, Atlanta, Colorado, Honolulu, New Orleans, Alaska, Phoenix, Denver, Salt Lake City, Dallas, Nevada, Des Moines, San Antonio, Chicago, Hawaii, Minnesota, New York (state), Birmingham, New Mexico, Louisiana, Anchorage, Ohio, Los Angeles, Philadelphia
Time period: 2002-01-01--2002-12-31
The goal of the Arrestee Drug Abuse Monitoring (ADAM) Program is to determine the extent and correlates of illicit drug use in the population of booked arrestees in local areas. Data were collected in 2002 at four separate times (quarterly) during the year in 36 metropolitan areas in the United States. The ADAM program adopted a new instrument in 2000 in adult booking facilities for male (Part 1) and female (Part 2) arrestees. Data from arrestees in juvenile detention facilities (Part 3) continued to use the juvenile instrument from previous years, extending back through the DRUG USE FORECASTING series (ICPSR 9477). The ADAM program in 2002 also continued the use of probability-based sampling for male arrestees in adult facilities, which was initiated in 2000. Therefore, the male adult sample includes weights, generated through post-sampling stratification of the data. For the adult files, variables fell into one of eight categories: (1) demographic data on each arrestee, (2) ADAM facesheet (records-based) data, (3) data on disposition of the case, including accession to a verbal consent script, (4) calendar of admissions to substance abuse and mental health treatment programs, (5) data on alcohol and drug use, abuse, and dependence, (6) drug acquisition data covering the five most commonly used illicit drugs, (7) urine test results, and (8) weights. The juvenile file contains demographic variables and arrestee's self-reported past and continued use of 15 drugs, as well as other drug-related behaviors.
Curated
Simple Crosstabs

The Community Vulnerability and Responses to Drug-User-Related HIV/AIDS, 1990-2013 [96 Metropolitan Statistical Areas, United States] (ICPSR 36575)

Released/updated on: 2017-08-08
Geographic coverage: North Carolina, Milwaukee, Indiana, Ocean (New Jersey), Fort Worth, Cincinnati, Austin, Monmouth (New Jersey), Utah, San Jose, Rock Hill, Gastonia, San Diego, Columbus (Ohio), Syracuse, Springfield (Massachusetts), North Little Rock (Arkansas), Arizona, Las Vegas, Arlington, Springfield (Ohio), Boston, San Bernardino, Providence, Seattle, Kentucky, St. Petersburg, Bethlehem, Niagara Falls (New York), Nashville, California, Florida, Delaware, Hunterdon (New Jersey), Boca Raton (Florida), Troy, Knoxville, Mississippi, Fresno, New Haven, Sarasota, Illinois, Newark, Georgia, Little Rock, Virginia, Maryland, Norfolk, Virginia Beach, Suffolk County (New York), United States, Oklahoma, Grand Rapids, Louisville, Waukesha (Wisconsin), Arkansas, Washington, South Carolina, Albany (New York), Wichita, Mesa (Arizona), Carlisle (Pennsylvania), Fall River, Massachusetts, Missouri, Winston-Salem, Holland (Michigan), New Orleans, Scranton, Denver, Salt Lake City, Harrisburg, Dallas, St. Louis, Nevada, Schenectady, Allentown, Raleigh, San Antonio, Muskegon, St. Paul, Clearwater, Hawaii, Rochester (New York), Passaic, Ventura (California), Birmingham, Michigan, Lebanon, Baltimore, New Mexico, Orlando, Louisiana, Toledo, Middlesex (New Jersey), Philadelphia, Riverside, Oklahoma City, Akron, Greensboro, Detroit, Charlotte, High Point, Tucson, Albuquerque, Everett, Oakland, Bakersfield, New York City, Somerset (New Jersey), Petersburg, Memphis, Ogden, Jacksonville, Buffalo, Pittsburgh, Nassau (New York), Orange County (California), Sacramento, El Paso, Greenville, Kansas, Meriden, Pennsylvania, Tulsa, Chapel Hill (North Carolina), West Palm Beach, Iowa, Texas, Lorain, Portland (Oregon), Hazleton, Tampa, Durham, San Marcos (Texas), Indianapolis, Richmond, Oregon, Warwick, Bergen (New Jersey), Newport News, Ann Arbor, Alabama, Cleveland, Dayton, Nebraska, Omaha, Warren, West Virginia, Elyria, Tacoma, Minneapolis, Youngstown, Atlanta, Honolulu, Phoenix, Bradenton, Wilmington (Delaware), Gary, District of Columbia, Rhode Island, Vancouver (Washington), Lodi (California), Chicago, Fort Lauderdale, Wilkes-Barre, Minnesota, Kansas City (Missouri), Bellevue, New York (state), Anderson, New Jersey, Miami, San Francisco, Charleston (South Carolina), Jersey City, Long Beach, Spartanburg (South Carolina), New Hampshire, Easton, Ohio, Los Angeles, Hartford, Stockton, Houston
Time period: 1990-01-01--2013-01-01

The Community Vulnerability and Responses to Drug-User-Related HIV/AIDS, 1990-2013 [96 Metropolitan Statistical Areas, United States] study (CVAR) investigated why large United States Metropolitan Statistical Areas (MSAs) vary over time in their vulnerability to HIV/AIDS among drug users and in MSA responses to HIV/AIDS. This collection contains estimates of HIV prevalence among people who injected drugs (PWID) and among sub-populations of PWID. The collection comprises ten datasets with differing numbers of variables and provides trend data that describe the following:

  • Epidemiologic outcomes including population prevalence of PWIDs and non-injecting drug users (NIDUs), and particularly their prevalence among youth; and, among PWIDs, HIV prevalence, late-diagnosis HIV cases, and AIDS incidence and mortality.
  • Implementation of evidence-based drug-related interventions including drug abuse treatment, syringe exchange, HIV counseling and testing.
  • Implementation of non-evidence-based drug-related interventions including incarceration and arrests of drug users.

The collection contains data on the MSA sub-populations including Black, Hispanic, White and "other" race categories. In addition, some statistics are presented in age range categories such as ages 15-29, 30-64 and 15-64.

Curated
Restricted

Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2003 (ICPSR 4020)

Released/updated on: 2006-03-30
Geographic coverage: North Carolina, Oklahoma City, Charlotte, Indiana, Tucson, Albuquerque, Spokane, Utah, San Jose, New York City, San Diego, Arizona, Las Vegas, Boston, Sacramento, Seattle, California, Florida, Pennsylvania, Tulsa, Iowa, Illinois, Texas, Portland (Oregon), Georgia, Tampa, Indianapolis, Oregon, United States, Oklahoma, Rio Arriba, Alabama, Cleveland, Washington, Nebraska, Albany (New York), Omaha, Minneapolis, Woodbury, Atlanta, Massachusetts, Colorado, Honolulu, New Orleans, Alaska, Phoenix, Denver, Salt Lake City, Dallas, Nevada, Des Moines, District of Columbia, San Antonio, Chicago, Hawaii, Minnesota, New York (state), Birmingham, Miami, New Mexico, Louisiana, Anchorage, Ohio, Los Angeles, Philadelphia, Houston
Time period: 2003-01-01--2003-12-31
The goal of the Arrestee Drug Abuse Monitoring (ADAM) Program is to determine the extent and correlates of illicit drug use in the population of booked arrestees in local areas. Data were collected in 2003 up to four separate times (quarterly) during the year in 39 metropolitan areas in the United States. The ADAM program adopted a new instrument in 2000 in adult booking facilities for male (Part 1) and female (Part 2) arrestees. The ADAM program in 2003 also continued the use of probability-based sampling for male arrestees in adult facilities, which was initiated in 2000. Therefore, the male adult sample includes weights, generated through post-sampling stratification of the data. For the adult male and female files, variables fell into one of eight categories: (1) demographic data on each arrestee, (2) ADAM facesheet (records-based) data, (3) data on disposition of the case, including accession to a verbal consent script, (4) calendar of admissions to substance abuse and mental health treatment programs, (5) data on alcohol and drug use, abuse, and dependence, (6) drug acquisition data covering the five most commonly used illicit drugs, (7) urine test results, and (8) for males, weights.
Curated
Restricted

Arrestee Drug Abuse Monitoring (ADAM) Program in the United States, 2001 (ICPSR 3688)

Released/updated on: 2006-03-30
Geographic coverage: North Carolina, Oklahoma City, Detroit, Charlotte, Indiana, Tucson, Albuquerque, Spokane, Utah, San Jose, New York City, San Diego, Arizona, Las Vegas, Sacramento, Seattle, California, Pennsylvania, Tulsa, Laredo, Iowa, Illinois, Texas, Portland (Oregon), Indianapolis, Oregon, United States, Oklahoma, Alabama, Cleveland, Washington, Nebraska, Albany (New York), Omaha, Minneapolis, Colorado, Honolulu, Missouri, New Orleans, Alaska, Phoenix, Denver, Salt Lake City, Dallas, Nevada, Des Moines, San Antonio, Chicago, Hawaii, Minnesota, Kansas City (Missouri), New York (state), Birmingham, Michigan, New Mexico, Louisiana, Anchorage, Ohio, Philadelphia
Time period: 2001-01-01--2001-12-31
The goal of the Arrestee Drug Abuse Monitoring (ADAM) Program is to determine the extent and correlates of illicit drug use in the population of booked arrestees in local areas. Data were collected in 2001 at four separate times (quarterly) during the year in 33 metropolitan areas in the United States. The ADAM program adopted a new instrument in 2000 in adult booking facilities for male (Part 1) and female (Part 2) arrestees. Data from arrestees in juvenile detention facilities (Part 3) continued to use the juvenile instrument from previous years, extending back through the DRUG USE FORECASTING series (ICPSR 9477). The ADAM program in 2001 also continued the use of probability-based sampling for male arrestees in adult facilities, which was initiated in 2000. Therefore, the male adult sample includes weights, generated through post-sampling stratification of the data. For the adult files, variables fell into one of eight categories: (1) demographic data on each arrestee, (2) ADAM facesheet (records-based) data, (3) data on disposition of the case, including accession to a verbal consent script, (4) calendar of admissions to substance abuse and mental health treatment programs, (5) data on alcohol and drug use, abuse, and dependence, (6) drug acquisition data covering the five most commonly used illicit drugs, (7) urine test results, and (8) weights. The juvenile file contains demographic variables and arrestee's self-reported past and continued use of 15 drugs, as well as other drug-related behaviors.
Curated
Simple Crosstabs

State Investments in Successful Transitions to Adulthood, 1970-2000 (ICPSR 34373)

Released/updated on: 2013-03-07
Geographic coverage: United States
Time period: 1971-01-01--2000-01-01

This research investigated the relationship between ascribed characteristics, family resources, personal circumstances, and public policies as these affect the transition to adulthood. The transition to adulthood has been extensively studied during the last four decades using a variety of well-established approaches and methods. Changes in the structure and pace of youth-to-adult transitions have been extensively documented, along with the increasingly complex lives young people lead as they negotiate the transition to adulthood. Relatively less attention has been devoted to the factors leading to these changes, and a variety of public policies related to state economic development efforts, education, and financial support for higher education have yet to be examined in any detail. This project built on the principal investigators' prior work on life course transitions and state economic and political contexts to estimate behavioral models of the late 20th and early 21st century transition to adulthood.

Specifically, this research:

  1. Defines and describes the successful transition to adulthood in terms of human capital accumulation, attainment of economic security, and partnership and life satisfaction.
  2. Identifies group and individual disparities in successful transitions, defined by ascribed characteristics, family resources, and personal circumstances.
  3. Measures the impact of the social and economic environments where these transitions occur and the effects of state structures and policies on the successful transition to adulthood, specifically examining whether the impact of these state policies differs by race/ethnicity, immigrant status, and disability status.

The analysis used discrete hazard modeling and hierarchical generalized linear modeling (HGLM) to build a general model of the transition to adulthood on a wide variety of dimensions (from educational attainment to stable employment in a full-time job, employment in a job with health insurance, to independent residence and life satisfaction) and examined systematic changes in the process leading to adulthood across cohorts and across race/ethnic, immigrant, and disability groups.

Curated
Restricted

Northwestern Juvenile Project (Cook County, Illinois): Follow-up 1, 1998-2001 (ICPSR 34931)

Released/updated on: 2018-06-08
Geographic coverage: United States, Chicago, Illinois
Time period: 1977-01-01--2006-01-01

This study contains data from the first follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. This initial follow-up occurred approximately three years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.

The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.

The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, correctional settings, or by telephone if they lived farther than two hours from Chicago.

Curated
Restricted

Dynamics of Retail Methamphetamine Markets in New York City, 2007-2009 (ICPSR 29821)

Released/updated on: 2014-01-06
Geographic coverage: New York City, United States, New York (state)
Time period: 2007-01-01--2009-12-31
The study was conducted to provide information about markets for, distribution of, and use of methamphetamine in New York City, both inside and outside of the MSM (men who have sex with men)/gay community. The study used Respondent Driven Sampling to recruit 132 methamphetamine market participants. Each respondent participated in a one to two hour structured interview combining both qualitative and quantitative responses. Each respondent was invited to recruit three additional eligible participants. Data collected included demographics, social network data, the respondent's market participation in obtaining and providing methamphetamine, consumption of methamphetamine, and experience with the criminal justice system and crime associated with participation in methamphetamine markets.
Curated
Restricted

Northwestern Juvenile Project (Cook County, Illinois): Follow-up 2, 1999-2005 (ICPSR 36629)

Released/updated on: 2018-06-08
Geographic coverage: United States, Chicago, Illinois
Time period: 1977-01-01--2005-01-01

This study contains data from the second follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. This second follow-up occurred approximately 3.5 years after the baseline interview and focused on the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risky behaviors.

The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Researchers studied changes in disorders over time (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. The NJP addressed the patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors are interrelated.

The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1,005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. A random subsample of 997 of the baseline participants was chosen for second follow-up interviews. Researchers tracked participants from the time they left detention and re-interviewed them regardless of where they were living when their follow-up interview was due: in the community, correctional settings, or by telephone if they lived farther than two hours from Chicago.

The study was funded by OJJDP, several institutes at the National Institutes of Health, and other federal agencies and private foundations. The National Institutes of Health funded an additional component on HIV/AIDS risk behaviors.

Curated
Restricted

Northwestern Juvenile Project (Cook County, Illinois), Follow-up 4, 2000-2006 (ICPSR 36686)

Released/updated on: 2018-06-08
Geographic coverage: United States, Chicago, Illinois
Time period: 1977-01-01--2006-01-01

This study contains data from the fourth follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. The fourth follow-up occurred approximately 4.5 years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.

The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.

The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. All participants were eligible for fourth follow-up interviews. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, correctional settings, or by telephone if they lived farther than two hours from Chicago.

Curated
Restricted

Northwestern Juvenile Project (Cook County, Illinois), Follow-up 3, 1999-2007 (ICPSR 36651)

Released/updated on: 2018-06-08
Geographic coverage: United States, Chicago, Illinois
Time period: 1977-01-01--2006-01-01

This study contains data from the third follow-up interview of the Northwestern Juvenile Project (NJP), a longitudinal assessment of alcohol, drug, or mental service treatment needs of juvenile detainees. The third follow-up occurred approximately four years after the baseline interview and focused on studying the development and persistence of psychiatric disorders, related predictive variables, patterns of drug use, and other risk behaviors.

The project's aims included studying (1) development and persistence of alcohol, drug, and mental disorders and (2) pathways and patterns of risky behaviors. Changes in disorders over time were studied (including onset, remission, and recurrence), comorbidity, associated functional impairments, and the risk and protective factors related to these disorders and impairments. This study addressed patterns and sequences of the development of drug use and related variables, focusing on gender differences, racial/ethnic differences, the antecedents of these risky behaviors (risk and protective factors), and how these behaviors were interrelated.

The original sample included 1829 randomly selected youth, 1172 males and 657 females, then 10 to 18 years old, enrolled in the study as they entered the Cook County Juvenile Temporary Detention Center from 1995 to 1998. Among the sample were 1005 African Americans, 524 Hispanics, and 296 non-Hispanic white respondents. Participants were tracked from the time they left detention. A random subsample of 997 of the baseline participants was chosen for third follow-up interviews. Re-interviews were conducted regardless of where respondents were living when their follow-up interview was due: in the community, correctional settings, or by telephone if they lived farther than two hours from Chicago.

Curated
Restricted
Simple Crosstabs

Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files (ICPSR 38008)

Released/updated on: 2026-04-13
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2017-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). For Wave 1 (baseline), the study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete the Youth Interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 0001 (DS0001) contains the data from the Public-Use File Master Linkage File (PUF-MLF). This file contains 93 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files.

Dataset 0002 (DS0002) contains the data from the Restricted-Use File Master Linkage File (RUF-MLF). This file contains 217 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.

Curated
Restricted

Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Restricted-Use Files (ICPSR 37519)

Released/updated on: 2026-04-13
Geographic coverage: United States
Time period: 2017-01-01--2018-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Wave 4.5 was a special data collection only for youth who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.

Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older were interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).

Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.

Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,617 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.

Dataset 1402 (DS1402) contains the Wave 4.5 State Identifier data for Youth and Parents and has 5 variables and 13,131 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire data and 3 variables designating the state (state Federal Information Processing Standards (FIPS) code, state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of Wave 4.5.
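
As a rough illustration of the PERSONID linkage described above, the following minimal Python sketch joins a questionnaire extract to a State Identifier extract. Only PERSONID comes from the description; the other column names (EVER_USED, STATE_FIPS, STATE_ABBR, STATE_NAME), the identifier values, and the sample rows are hypothetical stand-ins, not the actual variable names or data in the files.

```python
import csv
import io

# Hypothetical miniature extracts; only the PERSONID column name is taken
# from the study description. All other names and values are made up.
questionnaire_csv = """PERSONID,EVER_USED
P001,1
P002,0
P003,1
"""
state_csv = """PERSONID,STATE_FIPS,STATE_ABBR,STATE_NAME
P001,17,IL,Illinois
P002,36,NY,New York
"""

STATE_COLUMNS = ("STATE_FIPS", "STATE_ABBR", "STATE_NAME")


def link_by_personid(questionnaire_text, state_text):
    """Left-join questionnaire rows to state rows on PERSONID."""
    states = {
        row["PERSONID"]: row
        for row in csv.DictReader(io.StringIO(state_text))
    }
    linked = []
    for row in csv.DictReader(io.StringIO(questionnaire_text)):
        state = states.get(row["PERSONID"], {})
        merged = dict(row)
        for key in STATE_COLUMNS:
            # None marks a respondent with no matching state record.
            merged[key] = state.get(key)
        linked.append(merged)
    return linked


linked_rows = link_by_personid(questionnaire_csv, state_csv)
```

A left join is used here so that every questionnaire case is kept even when no state record matches; whether unmatched cases occur in the real files is not stated in the description.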

Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,619 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,871 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."
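
The case counts quoted for the questionnaire files above can be cross-checked with a few lines of Python. This is only a sketch: the figures are copied from the descriptions of DS1002, DS2001, and DS2002, while the tuple layout and the function name are illustrative choices.

```python
# (dataset, total cases, continuing respondents, aged-up respondents),
# copied from the dataset descriptions above.
case_counts = [
    ("DS1002", 13131, 11378, 1753),
    ("DS2001", 3628, 1014, 2614),
    ("DS2002", 7129, 7076, 53),
]


def find_mismatches(counts):
    """Return the datasets whose two subgroups do not sum to the total."""
    return [
        name
        for name, total, continuing, aged_up in counts
        if continuing + aged_up != total
    ]


mismatches = find_mismatches(case_counts)
```

An empty `mismatches` list means the continuing and aged-up subgroups account for every case in each file.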

Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5 and 5.

The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.

Dataset 2401 (DS2401) contains the Wave 5.5 State Identifier data for Adults and has 5 variables and 3,628 cases. Dataset 2402 (DS2402) contains the Wave 5.5 State Identifier data for Youth and Parents and has 5 variables and 7,129 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.5.

Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, and Wave 5.5 indicating if participants had ever/never used various tobacco products as of the Wave 5.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 977 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.

Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.

The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.

Dataset 3401 (DS3401) contains the PATH-ATS State Identifier data and has 5 variables and 8,874 cases. The State Identifier dataset includes PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in this dataset represent participants' state of residence at the time of PATH-ATS.

Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 3,142 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 2,169 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.

There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.

Dataset 4401 (DS4401) contains the Wave 7.5 State Identifier data for Adults and has 5 variables and 7,961 cases. Dataset 4402 (DS4402) contains the Wave 7.5 State Identifier data for Youth and Parents and has 5 variables and 8,949 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.5.

Dataset 4503 (DS4503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, PATH-ATS, Wave 6, Wave 7, and Wave 7.5 indicating if participants had ever/never used various tobacco products as of the Wave 7.5 data collection period. This data file contains 25 variables for all 82,139 study participants as of the Wave 7.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 7.5. This data file contains 53 variables and 157 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7.5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.5.

Curated
Simple Crosstabs

Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Public-Use Files (ICPSR 37786)

Released/updated on: 2025-06-27
Geographic coverage: United States
Time period: 2017-01-01--2018-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Public-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Wave 4.5 was a special data collection for youth only who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave.

Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older were interviewed via the PATH Study Adult Telephone Survey (PATH-ATS).

Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave.

Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,395 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5. The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who completed an interview in Wave 1 and in Wave 4.5, regardless of their participation in the intervening waves. The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 4.5 Youth Interview respondents in the Wave 4 Cohort.

Dataset 1503 (DS1503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, and Wave 4.5 indicating if participants had ever/never used various tobacco products as of the Wave 4.5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 4.5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 2001 (DS2001) contains the data from the Wave 5.5 Adult Questionnaire. This file contains 2,323 variables and 3,628 cases. Of these cases, 1,014 are continuing adults having completed a prior Adult Questionnaire. The other 2,614 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 5.5 Youth and Parent Questionnaire. This file contains 1,625 variables and 7,129 cases. Of these cases, 7,076 are continuing youth having completed a prior Youth Interview. The other 53 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 2111, 2112, 2121, 2122, 2221, and 2222 (DS2111, DS2112, DS2121, DS2122, DS2221, and DS2222) are data files comprising the weight variables for Wave 5.5. In Wave 5.5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 5.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, and 5.

The "single-wave" weight file for the Wave 4 Cohort contains weights for all Wave 5.5 interview respondents.

Dataset 3001 (DS3001) contains the data from PATH-ATS. This file contains 908 variables and 8,874 cases, all of which are continuing adults having completed a prior Adult Questionnaire, with their most recent interview in Wave 5.

Datasets 3111 and 3121 (DS3111 and DS3121) are data files comprising weights for PATH-ATS. In PATH-ATS, weight variables are in individual files corresponding to the Wave 1 and Wave 4 Cohorts.

The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed an interview in PATH-ATS and completed interviews in Waves 1, 2, 3, 4, and 5. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed an interview in PATH-ATS; all PATH-ATS respondents completed interviews in Wave 4 and Wave 5.

Dataset 2503 (DS2503) contains data derived from responses to questionnaires in Wave 1, Wave 2, Wave 3, Wave 4, Wave 4.5, Wave 5, Wave 5.5, and PATH-ATS, indicating if participants had ever/never used various tobacco products as of the Wave 5.5/PATH-ATS data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5.5/PATH-ATS data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4001 (DS4001) contains the data from the Wave 7.5 Adult Questionnaire. This file contains 2,760 variables and 7,961 cases. Of these cases, 5,952 are continuing adults having completed a prior Adult Questionnaire. The other 2,009 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 4002 (DS4002) contains the data from the Wave 7.5 Youth and Parent Questionnaire. This file contains 1,889 variables and 8,949 cases. Of these cases, 7,064 are continuing youth having completed a prior Youth Interview. The other 1,885 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 4111, 4112, 4121, 4122, 4221, 4222, 4231, and 4232 (DS4111, DS4112, DS4121, DS4122, DS4221, DS4222, DS4231, and DS4232) are data files comprising the weight variables for Wave 7.5. In Wave 7.5, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight file for the Wave 1 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 4.5, 5, 5.5, 6, and 7. The "all-waves" weight file for the Wave 4 Cohort contains weights for participants who completed a Wave 7.5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 4.5, 5, 5.5, 6, and 7.

There are two separate sets of files with "single-wave" weights: one for the Wave 4 Cohort and one for the Wave 7 Cohort. The "single-wave" weight file for the Wave 4 Cohort contains weights for Wave 7.5 interview respondents in the Wave 4 Cohort, regardless of their response status at Waves 4.5, 5, 5.5, 6, or 7. The "single-wave" weight file for the Wave 7 Cohort contains weights for all Wave 7.5 interview respondents in the Wave 7 Cohort.

Curated
Restricted

Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files (ICPSR 36231)

Released/updated on: 2026-04-21
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study.

Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview.

Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases.

Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment.
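As a rough illustration of the PERSONID linkage described above, the following sketch joins State Identifier rows onto questionnaire rows using plain Python dictionaries. Only PERSONID comes from the dataset description; the other column names (`example_item`, `state_fips`, `state_abbr`, `state_name`), the ID values, and the function `link_by_personid` are hypothetical placeholders, and the rows are toy data rather than PATH Study records.

```python
def link_by_personid(questionnaire_rows, state_rows):
    """Left-join state identifier rows onto questionnaire rows by PERSONID."""
    # Index the state rows once so each lookup is a dictionary access.
    state_by_id = {row["PERSONID"]: row for row in state_rows}
    linked = []
    for row in questionnaire_rows:
        merged = dict(row)  # copy so the input rows are left unchanged
        state = state_by_id.get(row["PERSONID"], {})
        for key, value in state.items():
            if key != "PERSONID":
                merged[key] = value
        linked.append(merged)
    return linked


# Toy rows with hypothetical column names; not real PATH Study data.
questionnaire = [
    {"PERSONID": "P001", "example_item": 1},
    {"PERSONID": "P002", "example_item": 2},
]
states = [
    {"PERSONID": "P001", "state_fips": "17",
     "state_abbr": "IL", "state_name": "Illinois"},
]

linked = link_by_personid(questionnaire, states)
for row in linked:
    print(row)
```

A left join keeps every questionnaire case even when no state row matches, so the linked file retains the questionnaire file's case count.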

Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 1.

Dataset 1801 (DS1801) contains Location Characteristics for Wave 1 Adults. This data file contains 4 variables and 32,320 cases.

Dataset 1802 (DS1802) contains Location Characteristics for Wave 1 Youth. This data file contains 4 variables and 13,651 cases.

Dataset 1901 (DS1901) contains Study Research Derived Variables for Wave 1 Adults created by PATH Study analysts. This data file contains 104 variables and 32,320 cases.

Dataset 1902 (DS1902) contains Study Research Derived Variables for Wave 1 Youth created by PATH Study analysts. This data file contains 89 variables and 13,651 cases.

Dataset 2011 (DS2011) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,421 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.

Dataset 2012 (DS2012) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,596 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."

Dataset 2411 (DS2411) contains the Wave 2 State Identifier data for Adults and has 5 variables and 28,362 cases. Dataset 2412 (DS2412) contains the Wave 2 State Identifier data for Youth and Parents and has 5 variables and 12,172 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 2.

Dataset 2611 (DS2611) contains the Tobacco Universal Product Code (UPC) data from Wave 2. This data file contains 32 variables and 7,295 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 2. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 2.

Dataset 2801 (DS2801) contains Location Characteristics for Wave 2 Adults. This data file contains 4 variables and 28,362 cases.

Dataset 2802 (DS2802) contains Location Characteristics for Wave 2 Youth. This data file contains 4 variables and 12,172 cases.

Dataset 2901 (DS2901) contains Study Research Derived Variables for Wave 2 Adults created by PATH Study analysts. This data file contains 178 variables and 28,362 cases.

Dataset 2902 (DS2902) contains Study Research Derived Variables for Wave 2 Youth created by PATH Study analysts. This data file contains 123 variables and 12,172 cases.

Dataset 3011 (DS3011) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,359 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 3012 (DS3012) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,492 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 3111, 3211, 3112, and 3212 (DS3111, DS3211, DS3112, and DS3212) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, starting with Wave 3, the weight variables have been separated into individual data files. The "all-waves" weight files contain weights for respondents who completed an interview for all waves in which they were old enough to do so or verified their information with the study for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.

Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 3411 (DS3411) contains the Wave 3 State Identifier data for Adults and has 5 variables and 28,148 cases. Dataset 3412 (DS3412) contains the Wave 3 State Identifier data for Youth and Parents and has 5 variables and 11,814 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 3.

Dataset 3611 (DS3611) contains the Tobacco Universal Product Code (UPC) data from Wave 3. This data file contains 32 variables and 6,768 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 3. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 3.

Dataset 3801 (DS3801) contains Location Characteristics for Wave 3 Adults. This data file contains 4 variables and 28,148 cases.

Dataset 3802 (DS3802) contains Location Characteristics for Wave 3 Youth. This data file contains 4 variables and 11,814 cases.

Dataset 3901 (DS3901) contains Study Research Derived Variables for Wave 3 Adults created by PATH Study analysts. This data file contains 107 variables and 28,148 cases.

Dataset 3902 (DS3902) contains Study Research Derived Variables for Wave 3 Youth created by PATH Study analysts. This data file contains 88 variables and 11,814 cases.

Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,504 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,600 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
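The case breakdowns quoted above for the Wave 4 Adult and Youth files can be checked against the stated totals with a few lines of arithmetic. This is only a consistency sketch: the counts are copied from the two descriptions above, while the dictionary keys and the helper `total_cases` are names made up for the example.

```python
# Case counts as quoted in the DS4001 and DS4002 descriptions above.
wave4_adult = {"continuing": 25857, "aged_up": 1900, "replenishment": 6065}
wave4_youth = {"continuing": 9365, "aged_up": 1694, "replenishment": 3739}


def total_cases(breakdown):
    """Sum the component counts of one data file."""
    return sum(breakdown.values())


adult_total = total_cases(wave4_adult)
youth_total = total_cases(wave4_youth)
print(adult_total, youth_total)  # 33822 14798
```

The same check can be repeated after loading a file, comparing the number of rows read against the case count listed for that dataset.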

Datasets 4111, 4211, 4321, 4112, 4212, and 4322 (DS4111, DS4211, DS4321, DS4112, DS4212, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.

Dataset 4401 (DS4401) contains the Wave 4 State Identifier data for Adults and has 5 variables and 33,822 cases. Dataset 4402 (DS4402) contains the Wave 4 State Identifier data for Youth and Parents and has 5 variables and 14,798 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 4. For adults and youth from the replenishment sample, the values also represent state of residence at the time of recruitment.

Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires, indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4601 (DS4601) contains the Tobacco Universal Product Code (UPC) data from Wave 4. This data file contains 32 variables and 7,684 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 4. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 4.

Dataset 4801 (DS4801) contains Location Characteristics for Wave 4 Adults. This data file contains 4 variables and 33,822 cases.

Dataset 4802 (DS4802) contains Location Characteristics for Wave 4 Youth. This data file contains 4 variables and 14,798 cases.

Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,606 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,776 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.

Dataset 5401 (DS5401) contains the Wave 5 State Identifier data for Adults and has 5 variables and 34,309 cases. Dataset 5402 (DS5402) contains the Wave 5 State Identifier data for Youth and Parents and has 5 variables and 12,098 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 5.
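As an illustration of the PERSONID linkage described above, the following is a minimal sketch using only the Python standard library. Only PERSONID comes from the dataset description; the other field names (example_item, state_fips, state_abbr, state_name), the ID values, and the helper link_by_personid are hypothetical stand-ins, not the actual variable names in the files.

```python
# Hypothetical in-memory rows. Only PERSONID is a variable name taken from
# the dataset description; every other field name and value is made up.
questionnaire_rows = [
    {"PERSONID": "P0001", "example_item": 1},
    {"PERSONID": "P0002", "example_item": 2},
]
state_rows = [
    {"PERSONID": "P0001", "state_fips": "17", "state_abbr": "IL", "state_name": "Illinois"},
    {"PERSONID": "P0002", "state_fips": "06", "state_abbr": "CA", "state_name": "California"},
]


def link_by_personid(left, right):
    """Left-join two lists of dict rows on PERSONID."""
    # Index the right-hand table once so each lookup is constant time.
    lookup = {row["PERSONID"]: row for row in right}
    merged = []
    for row in left:
        # Rows without a match keep only their own fields.
        match = lookup.get(row["PERSONID"], {})
        merged.append({**row, **match})
    return merged


linked = link_by_personid(questionnaire_rows, state_rows)
```

With real files, the same join would normally be done in a statistical package after loading both datasets; the point here is only that PERSONID is the shared key.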

Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 5601 (DS5601) contains the Tobacco Universal Product Code (UPC) data from Wave 5. This data file contains 33 variables and 6,678 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 5. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 5.

Dataset 5801 (DS5801) contains Location Characteristics for Wave 5 Adults. This data file contains 4 variables and 34,309 cases.

Dataset 5802 (DS5802) contains Location Characteristics for Wave 5 Youth. This data file contains 4 variables and 12,098 cases.

Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,935 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 2,080 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth Interview and 60 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

Dataset 6401 (DS6401) contains the Wave 6 State Identifier data for Adults and has 5 variables and 30,516 cases. Dataset 6402 (DS6402) contains the Wave 6 State Identifier data for Youth and Parents and has 5 variables and 5,652 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 6.

Dataset 6503 (DS6503) contains data derived from responses to questionnaires in Waves 1-6 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 6601 (DS6601) contains the Tobacco Universal Product Code (UPC) data from Wave 6. This data file contains 53 variables and 5,408 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 6. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 6.

Dataset 6801 (DS6801) contains Location Characteristics for Wave 6 Adults. This data file contains 4 variables and 30,516 cases.

Dataset 6802 (DS6802) contains Location Characteristics for Wave 6 Youth. This data file contains 4 variables and 5,652 cases.

Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 3,221 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 2,171 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).

Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.
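The Wave 7 eligibility rules above for the Wave 1 Cohort can be sketched as a small set-membership check. This is a simplified illustration, not the study's weighting procedure: it treats "completed an interview" and "verified their information" as the same thing (a single set of wave numbers in which the participant took part), it ignores the special collections, and the function name and constants are hypothetical.

```python
# Simplified sketch of Wave 7 weight eligibility for the Wave 1 Cohort.
# Assumption: `participated` holds every wave in which the person either
# completed an interview or (if too young) had their information verified.

ALL_WAVES_REQUIRED = {1, 2, 3, 4, 5, 6, 7}   # "all-waves" weights
SINGLE_WAVE_REQUIRED = {1, 7}                # "single-wave" weights


def wave7_weight_types(participated):
    """Return the Wave 1 Cohort weight types this participation history fits."""
    types = []
    if ALL_WAVES_REQUIRED <= participated:      # subset test: every wave present
        types.append("all-waves")
    if SINGLE_WAVE_REQUIRED <= participated:    # intervening waves do not matter
        types.append("single-wave")
    return types


complete_history = wave7_weight_types({1, 2, 3, 4, 5, 6, 7})
gappy_history = wave7_weight_types({1, 3, 7})
wave4_entrant = wave7_weight_types({4, 5, 6, 7})
```

A participant with a complete history fits both weight types, one who skipped intervening waves fits only the "single-wave" type, and someone who entered at Wave 4 fits neither Wave 1 Cohort type.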

Dataset 7401 (DS7401) contains the Wave 7 State Identifier data for Adults and has 5 variables and 30,801 cases. Dataset 7402 (DS7402) contains the Wave 7 State Identifier data for Youth and Parents and has 5 variables and 10,834 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 7.

Dataset 7503 (DS7503) contains data derived from responses to questionnaires in Waves 1-7 (including the special collections in Wave 4.5, Wave 5.5, and PATH-ATS) indicating if participants had ever/never used various tobacco products as of the Wave 7 data collection period. This data file contains 26 variables for all 82,139 study participants as of the Wave 7 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 7601 (DS7601) contains the Tobacco Universal Product Code (UPC) data from Wave 7. This data file contains 53 variables and 4,533 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 7. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used by these respondents at the time of Wave 7.

Dataset 7801 (DS7801) contains Location Characteristics for Wave 7 Adults. This data file contains 4 variables and 30,801 cases.

Dataset 7802 (DS7802) contains Location Characteristics for Wave 7 Youth. This data file contains 4 variables and 10,834 cases.

Dataset 8001 (DS8001) contains the data from the Wave 8 Adult Questionnaire. This data file contains 3,467 variables and 31,477 cases. Of these cases, 30,021 are continuing adults having completed a prior Adult Questionnaire and 1,456 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 8002 (DS8002) contains the data from the Wave 8 Youth and Parent Questionnaire. This data file contains 2,393 variables and 8,002 cases. Of these cases, 7,046 are continuing youth having completed a prior Youth Interview and 956 are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 8111, 8121, 8122, 8211, 8221, 8231, 8232, 8711, 8721, 8722, 8731, and 8732 (DS8111, DS8121, DS8122, DS8211, DS8221, DS8231, DS8232, DS8711, DS8721, DS8722, DS8731, and DS8732) are data files comprising the weight variables for Wave 8. In Wave 8, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and 7. Note that only adults have "all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and 7.

There are three separate sets of files with "single-wave" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 8, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 8, regardless of their participation in the intervening waves. Note that only adults have "single-wave" weights for the Wave 1 and Wave 4 Cohorts; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8 and youth from the Wave 4 Cohort were selected as shadow youth so they do not have any interview data from Wave 4. The "single-wave" weight files for the Wave 7 Cohort contain weights for participants who completed an interview in Wave 7 and in Wave 8.

There are also three separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort, one for the Wave 4 Cohort, and one for the Wave 7 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, 7 and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. Note that only adults have "special collection all-waves" weights for the Wave 1 Cohort; youth from the Wave 1 Cohort aged-up to adults by the time of Wave 8. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, 7, and the special collections in Wave 4.5, Wave 5.5, and Wave 7.5. The "special collection all-waves" weight files for the Wave 7 Cohort contain weights for participants who completed a Wave 8 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 7 and the special collection in Wave 7.5.

Dataset 8401 (DS8401) contains the Wave 8 State Identifier data for Adults and has 5 variables and 31,477 cases. Dataset 8402 (DS8402) contains the Wave 8 State Identifier data for Youth and Parents and has 5 variables and 8,002 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state FIPS, state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 8.

Dataset 8801 (DS8801) contains Location Characteristics for Wave 8 Adults. This data file contains 4 variables and 31,477 cases.

Dataset 8802 (DS8802) contains Location Characteristics for Wave 8 Youth. This data file contains 4 variables and 8,002 cases.

Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. In all waves of data collection, less than 0.5 percent of the parents did not complete an interview. Most questions are asked about the child.

When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) for the other child(ren) and the responses duplicated in that child(ren)'s data file(s).

Curated
Restricted

Population Assessment of Tobacco and Health (PATH) Study [United States] Biomarker Restricted-Use Files (ICPSR 36840)

Released/updated on: 2025-12-10
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The Population Assessment of Tobacco and Health (PATH) Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study was launched in 2011 to inform the FDA's tobacco regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). For Wave 1 (baseline), the PATH Study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco, yielding interviews with 45,971 adult and youth respondents.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled PSUs and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Biospecimen Collection

Each adult respondent, who completed the interview at Wave 1, was asked to provide at least two biospecimens. Providing biospecimens was voluntary and was not a condition of participation. Respondents were asked to report their use of all nicotine-containing products during the 3-day period prior to the time of any biospecimen collection (Nicotine Exposure Questions (NEQs)) to facilitate interpretation of biomarker results.

Of the 32,320 respondents who completed the Adult Interview at Wave 1, 21,801 (67.4 percent) provided a urine specimen and 14,520 (44.9 percent) provided a blood specimen. For the purposes of subsampling adults into the Wave 1 Biomarker Core, adult participants were grouped by tobacco product use at Wave 1 into nine mutually exclusive groups.

A sample of 11,522 adults who provided sufficient urine for the planned analyses were selected from the first six tobacco product use groups (see section 3.1 of the Biomarker Restricted-Use Files User Guide) representing people who never used tobacco, currently use tobacco, and formerly used tobacco (within the last 12 months). This group constitutes the original Wave 1 Biomarker Core. Of the 11,522 adults, 7,159 also provided a blood specimen. All urine and blood specimens provided by the Wave 1 Biomarker Core were sent for laboratory analysis.

Subsequent to this selection, an additional stratified probability sample of adults who completed the Wave 1 Adult Interview and provided a sufficient amount of urine for the planned analyses at Wave 1 (independent of whether they provided a blood specimen) was selected from the remaining three product use groups (see section 3.1 of the Biomarker Restricted-Use Files User Guide). Wave 1 blood and urine specimens from this expansion sample were also sent for laboratory analysis. The original and expansion samples together form the expanded Wave 1 Biomarker Core. The expansion sample did not provide urine specimens for laboratory analysis again until Wave 7.

Each youth who completed the Wave 4 interview was asked to provide a urine specimen. Each Wave 4 shadow youth (ages 10 and 11 at Wave 4) who completed the Wave 5 youth interview was also asked to provide a urine specimen. Providing this urine biospecimen was voluntary and was not a condition of participation.

Of the 14,798 respondents who completed the Youth Interview at Wave 4, 13,097 (88.5 percent) provided a urine specimen. A sample of 3,509 Wave 4 Cohort youth ages 12 to 17 who completed the Wave 4 Youth Interview and provided a sufficient amount of urine for the planned laboratory analyses was selected from a diverse mix of five tobacco product use and non-use groups. In addition, a sample of 528 Wave 4 shadow youth who completed a Wave 5 interview and provided a sufficient amount of urine for the planned laboratory analyses at Wave 5 was also selected. These 4,037 sampled youth and shadow youth constitute the Wave 4 Biomarker Core. All urine specimens provided by the Wave 4 Biomarker Core were sent for laboratory analysis.
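The counts in the paragraph above can be checked with a few lines of arithmetic. This is only a worked check of the published figures; the variable names are illustrative.

```python
# Figures taken directly from the Wave 4 youth biospecimen description.
youth_interviews = 14_798      # completed the Wave 4 Youth Interview
urine_provided = 13_097        # provided a urine specimen

# Share of interviewed youth who provided urine, as a percentage.
urine_rate = round(100 * urine_provided / youth_interviews, 1)

core_youth = 3_509             # sampled Wave 4 Cohort youth ages 12 to 17
core_shadow = 528              # sampled Wave 4 shadow youth (Wave 5 urine)
core_total = core_youth + core_shadow   # size of the Wave 4 Biomarker Core
```

Both results agree with the stated 88.5 percent and 4,037 members.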

As members of the Wave 1 and Wave 4 Biomarker Cores age over time, a new Wave 7 Biomarker Core was designed to provide nationally representative estimates for the U.S. civilian noninstitutionalized adult (ages 18 and older) population (CNP) at the time of Wave 7 (2022-2023). To that end, at the conclusion of Wave 7, a new biomarker core was selected from Wave 7 Cohort adults who completed an interview and provided a urine specimen at Wave 7. The Wave 7 Biomarker Core sample selection was a two-stage process. Prior to the start of data collection, a subsample of continuing participants expected to be adults at the time of their Wave 7 interview, including some participants who were part of the Wave 1 or Wave 4 Biomarker Cores, was selected and flagged for urine collection; additionally, a subsample of replenishment sample addresses was selected and flagged so that any Wave 7 Adult Interview respondents living at the selected addresses would be asked to provide a urine specimen. Of the 10,698 Adult Interview respondents from these subsamples, 9,187 (85.9 percent) provided a urine specimen. A sample of 7,750 Wave 7 Cohort adults who completed the Wave 7 Adult Interview and provided a sufficient amount of urine for the planned laboratory analyses was selected from six mutually exclusive and exhaustive tobacco use groups (see section 3.3 of the Biomarker Restricted-Use Files User Guide). All urine specimens provided by the Wave 7 Biomarker Core were sent for laboratory analysis.

Biomarker Restricted Use Files

Wave 1 Restricted-Use Biomarker Data Files (Biomarker RUF) consists of three different types of files for the Wave 1 Biomarker Core:

  • 2 Collection and NEQ files for Urine (DS1001) and Blood (DS1101)
  • 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS1021) and Blood (DS1121). Both files are updated to include records for the expanded Wave 1 Biomarker Core.
  • 8 Urine Panels (DS1031 to DS1038), 4 Serum Panels (DS1131 to DS1134) and 1 Plasma Panel (DS1231) containing biomarker assay results. 6 Urine Panels (DS1032, DS1033, DS1035, DS1036, DS1037, and DS1038) and 2 Serum Panels (DS1131 and DS1132) are updated to include records for the expanded Wave 1 Biomarker Core.

All files updated to include records for the expanded Wave 1 Biomarker Core contain an indicator R01_A_W1BC_TYPE (1 = Original, 2 = Expansion) to identify respondents in the Wave 1 Biomarker Core original and expansion subsamples.

For Wave 2, urine biospecimens were requested from the original Wave 1 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.

The Wave 2 Biomarker RUF consists of three different types of files:

  • 1 Collection and NEQ file for Urine (DS2001)
  • 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS2021) and F2PG2a (DS2022)
  • 8 Urine Panels (DS2031 to DS2038) containing biomarker assay results.

For Wave 3, urine biospecimens were requested from the original Wave 1 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.

The Wave 3 Biomarker RUF consists of three different types of files:

  • 1 Collection and NEQ file for Urine (DS3001)
  • 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS3021 and DS3022) and F2PG2a (DS3023 and DS3024).
  • 7 Urine Panels (DS3032 to DS3038) containing biomarker assay results.

For Wave 4, urine biospecimens were requested from the original Wave 1 Biomarker Core and all youth who completed the Wave 4 interview. Respondents were also asked to complete the NEQs prior to biospecimen collection.

The Wave 4 Biomarker RUF consists of the following files for each Biomarker Core:

Wave 1 Biomarker Core:

  • 1 Collection and NEQ file for Urine (DS4001)
  • 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS4021 and DS4022) and F2PG2a (DS4023 and DS4024).
  • 7 Urine Panels (DS4032, DS4033, DS4034, DS4035, DS4036, DS4037 and DS4038) containing biomarker assay results.

Wave 4 Biomarker Core:

  • 1 Collection and NEQ file for Youth Urine (DS4011)
  • 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS4043)
  • 7 Urine Panels (DS4051, DS4053, DS4054, DS4055, DS4056, DS4057 and DS4058) containing biomarker assay results.

For Wave 5, urine biospecimens were requested from the original Wave 1 Biomarker Core and the Wave 4 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.

The Wave 5 Biomarker RUF consists of the following files for each Biomarker Core:

Wave 1 Biomarker Core:

  • 1 Collection and NEQ file for Urine (DS5001)
  • 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS5021 and DS5022) and F2PG2a (DS5023 and DS5024)
  • 6 Urine Panels (DS5032, DS5033, DS5035, DS5036, DS5037, and DS5038) containing biomarker assay results.

Wave 4 Biomarker Core:

  • 1 Collection and NEQ file for Youth Urine (DS5011)
  • 1 Collection and NEQ file for Adult Urine (DS5001)
  • 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS5042)
  • 7 Urine Panels (DS5051, DS5053, DS5054, DS5055, DS5056, DS5057, and DS5058) containing biomarker assay results.

Note that the initial release of 3 Urine Panels and Biomarker weights for the Wave 4 Biomarker Core only included records for those among the 3,509 members who responded in Wave 5 and provided urine specimens in sufficient quantities for laboratory analyses. As of version 20, the Wave 5 biomarker data files and weights include data for all Wave 4 Biomarker Core members who provided urine specimens at Wave 5 in sufficient quantities for laboratory analyses, including the Wave 4 shadow youth who completed their first interviews at Wave 5. This means that records were added to previously released urine panel data files (DS5051, DS5053, and DS5056) and biomarker weights (DS5042) to include data for the Wave 4 shadow youth (N=528) who completed their first interviews at Wave 5. All panels released in version 20 and beyond will include records for the complete Wave 4 Biomarker Core.

Also note that the Collection and NEQ file for Adult Urine (DS5001) includes data for both the Wave 1 Biomarker Core and Wave 4 Biomarker Core.

For Wave 7, urine biospecimens were requested from the Wave 1 Biomarker Core, the Wave 4 Biomarker Core, and those in the subsample eligible for the Wave 7 Biomarker Core. Respondents were also asked to complete the NEQs prior to biospecimen collection.

The Wave 7 Biomarker RUF consists of the following files for each Biomarker Core:

Wave 1 Biomarker Core:

  • 1 Collection and NEQ file for Urine (DS7001)
  • 4 Biomarker Weight files including variables for use in variance estimation for Urine (DS7021 and DS7022) and F2PG2a (DS7023 and DS7024)
  • 6 Urine Panels (DS7032, DS7033, DS7035, DS7036, DS7037, and DS7038) containing biomarker assay results.

Wave 4 Biomarker Core:

  • 1 Collection and NEQ file for Youth Urine (DS7011)
  • 1 Collection and NEQ file for Adult Urine (DS7001)
  • 2 Biomarker Weight files including variables for use in variance estimation for Urine (DS7041 and DS7042)
  • 6 Urine Panels (DS7051, DS7053, DS7055, DS7056, DS7057, and DS7058) containing biomarker assay results.

Wave 7 Biomarker Core:

  • 1 Collection and NEQ file for Urine (DS7001)
  • 1 Biomarker Weight file including variables for use in variance estimation for Urine (DS7061)
  • 4 Urine Panels (DS7072, DS7073, DS7076, DS7077) containing biomarker assay results.

The Collection and NEQ file for Adult Urine (DS7001) includes data for the Wave 1 Biomarker Core, Wave 4 Biomarker Core, and Wave 7 Biomarker Core.

Please refer to the Biomarker Restricted-Use Files User Guide for additional information about the Biomarker Cores.

References to the collection of biospecimens will be specified by the collected specimen, i.e., urine and (whole) blood. However, references to biomarker analyses and analytes will be specified by the type of matrix (serum, plasma, or urine) used for the analysis.

Curated
Simple Crosstabs

Population Assessment of Tobacco and Health (PATH) Study [United States] Public-Use Files (ICPSR 36498)

Released/updated on: 2025-04-08
Geographic coverage: United States
Time period: 2013-01-01--2014-01-01, 2014-01-01--2015-01-01, 2015-01-01--2016-01-01, 2016-01-01--2018-01-01, 2018-01-01--2019-01-01, 2022-01-01--2023-01-01

The Population Assessment of Tobacco and Health (PATH) Study originally surveyed 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco.

45,971 adults and youth constitute the first (baseline) wave of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 14 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave.
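The participant counts quoted so far can be cross-checked with simple arithmetic. The sketch below uses only figures stated in this description; the variable names are illustrative. The observation that the Master Linkage file's 67,276 cases equal the Wave 1 Cohort plus the Wave 4 replenishment sample is an inference from the numbers, not a statement taken from the study documentation.

```python
# Counts quoted in the study description.
wave1_respondents = 45_971     # Wave 1 adult and youth respondents
wave1_shadow_youth = 7_207     # shadow youth sampled at Wave 1
wave4_replenishment = 14_098   # Wave 4 replenishment sample
master_linkage_cases = 67_276  # cases in the Master Linkage file (DS0001)

# Wave 1 Cohort = baseline respondents + shadow youth.
wave1_cohort = wave1_respondents + wave1_shadow_youth
print(wave1_cohort)  # 53178

# Inference only: the linkage file size matches the Wave 1 Cohort
# plus the Wave 4 replenishment sample.
print(wave1_cohort + wave4_replenishment == master_linkage_cases)  # True
```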

At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort.

Please refer to the Public-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts.

Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview.

Dataset 1002 (DS1002) contains the data from the Youth and Parent Questionnaire. This file contains 1,228 variables and 13,651 cases.

Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.

Dataset 2002 (DS2002) contains the data from the Wave 2 Youth and Parent Questionnaire. This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."

Dataset 3001 (DS3001) contains the data from the Wave 3 Adult Questionnaire. This data file contains 2,139 variables and 28,148 cases. Of these cases, 26,241 are continuing adults having completed a prior Adult Questionnaire. The other 1,907 cases are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 3002 (DS3002) contains the data from the Wave 3 Youth and Parent Questionnaire. This data file contains 1,309 variables and 11,814 cases. Of these cases, 9,769 are continuing youth having completed a prior Youth Interview. The other 2,045 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 3101, 3102, 3201 and 3202 (DS3101, DS3102, DS3201, and DS3202) are data files comprising the weight variables for Wave 3. The weight variables for Wave 1 and Wave 2 are included in the main data files. However, in Wave 3, the weight variables have been separated into individual data files for Adult and Youth Questionnaires. The "all-waves" weight files contain weights for those respondents who have completed an interview during all three waves of data collection. The "single-wave" weight files contain weights for all respondents in Wave 3 regardless of their participation in previous waves.

Dataset 3503 (DS3503) contains data derived from responses to Wave 1-3 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 3 study period. This data file contains 25 variables for all 53,178 study participants as of Wave 3. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 4001 (DS4001) contains the data from the Wave 4 Adult Questionnaire. This data file contains 2,182 variables and 33,822 cases. Of these cases, 25,857 are continuing adults having completed a prior Adult Questionnaire, 1,900 are "aged-up adults" having previously completed a Youth Questionnaire, and 6,065 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 4002 (DS4002) contains the data from the Wave 4 Youth and Parent Questionnaire. This data file contains 1,389 variables and 14,798 cases. Of these cases, 9,365 are continuing youth having completed a prior Youth Interview, 1,694 cases are "aged-up youth" having previously been sampled as "shadow youth," and 3,739 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).

Datasets 4111, 4112, 4211, 4212, 4321, and 4322 (DS4111, DS4112, DS4211, DS4212, DS4321, and DS4322) are data files comprising the weight variables for Wave 4. In Wave 4, the weight variables have been separated into individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort respondents who completed an interview for all waves in which they were old enough or verified their information for waves in which they were not old enough to be interviewed. The "single-wave" weight files contain weights for Wave 1 Cohort respondents at Wave 4 who completed an interview at Wave 1, regardless of their participation in previous waves. The "cross-sectional" weight files contain weights for all respondents in the Wave 4 Cohort.

Dataset 4503 (DS4503) contains data derived from responses to Wave 1-4 questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 4 data collection period. This data file contains 27 variables for all 67,276 study participants as of the Wave 4 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Dataset 5001 (DS5001) contains the data from the Wave 5 Adult Questionnaire. This data file contains 2,315 variables and 34,309 cases. Of these cases, 29,876 are continuing adults having completed a prior Adult Questionnaire and 4,433 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 5002 (DS5002) contains the data from the Wave 5 Youth and Parent Questionnaire. This data file contains 1,530 variables and 12,098 cases. Of these cases, 10,446 are continuing youth having completed a prior Youth Interview and 1,652 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 5111, 5112, 5211, 5212, 5221, 5222, 5711, 5712, 5721, and 5722 (DS5111, DS5112, DS5211, DS5212, DS5221, DS5222, DS5711, DS5712, DS5721, and DS5722) are data files comprising the weight variables for Wave 5. In Wave 5, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. The "all-waves" weight files contain weights for those Wave 1 Cohort participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, and 4.

Dataset 5503 (DS5503) contains data derived from responses to Wave 1-5 (including Wave 4.5) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 5 data collection period. This data file contains 26 variables for all 67,276 study participants as of the Wave 5 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 5, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for all Wave 5 interview respondents in the Wave 4 Cohort.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and the special collection in Wave 4.5. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 5 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Wave 4 and the special collection in Wave 4.5.

Dataset 6001 (DS6001) contains the data from the Wave 6 Adult Questionnaire. This data file contains 2,589 variables and 30,516 cases. Of these cases, 28,852 are continuing adults having completed a prior Adult Questionnaire and 1,664 are "aged-up adults" having previously completed a Youth Questionnaire.

Dataset 6002 (DS6002) contains the data from the Wave 6 Youth and Parent Questionnaire. This data file contains 1,822 variables and 5,652 cases. Of these cases, 5,622 are continuing youth having completed a prior Youth interview and 30 cases are "aged-up youth" having previously been sampled as "shadow youth."

Datasets 6111, 6112, 6121, 6122, 6211, 6212, 6221, 6222, 6711, 6712, 6721, and 6722 (DS6111, DS6112, DS6121, DS6122, DS6211, DS6212, DS6221, DS6222, DS6711, DS6712, DS6721, and DS6722) are data files comprising the weight variables for Wave 6. In Wave 6, the weight variables are in individual data files corresponding to the Wave 1 and Wave 4 Cohorts and different weight types. There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, and 5. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 6, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 6, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 6 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4 and 5, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.
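The "all-waves" and "single-wave" rules for Wave 6 can be read as two eligibility tests on a participant's wave-by-wave history. The following is a minimal sketch of a literal reading of those two rules, not the study's actual weighting procedure. The function name, the dictionary encoding of participation, and the status labels "interview" and "verified" are all hypothetical. The "special collection all-waves" rule is left out because it adds requirements for the interim collections.

```python
def weight_eligibility(participation, cohort_start, current_wave):
    """Apply the two rules quoted above to one participant.

    participation: dict mapping wave number -> "interview" or "verified"
    (hypothetical encoding; waves with no participation are simply absent).
    """
    def interviewed(wave):
        return participation.get(wave) == "interview"

    def took_part(wave):
        # An interview, or verified information when too young to be interviewed.
        return participation.get(wave) in ("interview", "verified")

    # All-waves: interview at the current wave plus participation in every
    # earlier wave since the cohort began.
    all_waves = interviewed(current_wave) and all(
        took_part(w) for w in range(cohort_start, current_wave)
    )
    # Single-wave: interviews at the cohort's first wave and the current wave.
    single_wave = interviewed(cohort_start) and interviewed(current_wave)
    return {"all_waves": all_waves, "single_wave": single_wave}


# Wave 1 Cohort member who missed Wave 3.
missed_wave3 = {1: "interview", 2: "interview", 4: "interview",
                5: "interview", 6: "interview"}
print(weight_eligibility(missed_wave3, cohort_start=1, current_wave=6))
# {'all_waves': False, 'single_wave': True}

# Wave 4 Cohort member interviewed at every wave since joining.
complete_w4 = {4: "interview", 5: "interview", 6: "interview"}
print(weight_eligibility(complete_w4, cohort_start=4, current_wave=6))
# {'all_waves': True, 'single_wave': True}
```

Under this reading, a participant who skips an intervening wave stays in the single-wave files but drops out of the all-waves files.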

Dataset 7001 (DS7001) contains the data from the Wave 7 Adult Questionnaire. This data file contains 2,813 variables and 30,801 cases. Of these cases, 27,258 are continuing adults having completed a prior Adult Questionnaire, 1,740 are "aged-up adults" having previously completed a Youth Questionnaire, and 1,803 are "replenishment sample adults" (also known as "new cohort adults" in the annotated instrument).

Dataset 7002 (DS7002) contains the data from the Wave 7 Youth and Parent Questionnaire. This data file contains 1,897 variables and 10,834 cases. Of these cases, 3,512 are continuing youth having completed a prior Youth Interview, 1 case is an "aged-up youth" having previously been sampled as "shadow youth," and 7,321 are "replenishment sample youth" (also known as "new cohort youth" in the annotated instrument).
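Each questionnaire file described above is broken down into continuing, aged-up, and (where applicable) replenishment cases. As a quick consistency check, the sketch below confirms that the quoted components sum to the quoted totals. All figures are copied from this description; the dictionary layout and function name are illustrative only.

```python
# (stated total cases, stated component counts) for each data file.
BREAKDOWNS = {
    "Wave 1 (DS1001 + DS1002)": (45_971, [32_320, 13_651]),
    "DS2001": (28_362, [26_447, 1_915]),
    "DS2002": (12_172, [10_081, 2_091]),
    "DS3001": (28_148, [26_241, 1_907]),
    "DS3002": (11_814, [9_769, 2_045]),
    "DS4001": (33_822, [25_857, 1_900, 6_065]),
    "DS4002": (14_798, [9_365, 1_694, 3_739]),
    "DS5001": (34_309, [29_876, 4_433]),
    "DS5002": (12_098, [10_446, 1_652]),
    "DS6001": (30_516, [28_852, 1_664]),
    "DS6002": (5_652, [5_622, 30]),
    "DS7001": (30_801, [27_258, 1_740, 1_803]),
    "DS7002": (10_834, [3_512, 1, 7_321]),
}


def inconsistent(breakdowns):
    """Return the labels whose components do not sum to the stated total."""
    return [
        label
        for label, (total, parts) in breakdowns.items()
        if sum(parts) != total
    ]


print(inconsistent(BREAKDOWNS))  # []
```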

Datasets 7111, 7112, 7121, 7122, 7211, 7212, 7221, 7222, 7331, 7332, 7711, 7712, 7721, and 7722 (DS7111, DS7112, DS7121, DS7122, DS7211, DS7212, DS7221, DS7222, DS7331, DS7332, DS7711, DS7712, DS7721, and DS7722) are data files comprising the weight variables for Wave 7. In Wave 7, the weight variables are in individual data files corresponding to the Wave 1, Wave 4, and Wave 7 Cohorts and different weight types.

There are two separate sets of files with "all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, and 6. The "all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, and 6.

There are two separate sets of files with "single-wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight files for the Wave 1 Cohort contain weights for participants who completed an interview in Wave 1 and in Wave 7, regardless of their participation in the intervening waves. The "single-wave" weight files for the Wave 4 Cohort contain weights for participants who completed an interview in Wave 4 and in Wave 7, regardless of their participation in the intervening waves.

There are also two separate sets of files with "special collection all-waves" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "special collection all-waves" weight files for the Wave 1 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 1, 2, 3, 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS. The "special collection all-waves" weight files for the Wave 4 Cohort contain weights for participants who completed a Wave 7 interview and completed interviews (if old enough to do so) or verified their information (if not old enough to be interviewed) in Waves 4, 5, 6, and the special collections in Wave 4.5, and Wave 5.5 or PATH-ATS.

The "cross-sectional" weight files contain weights for all respondents in the Wave 7 Cohort.

Dataset 6503 (DS6503) contains data derived from responses to Wave 1-6 (including Wave 4.5, Wave 5.5, and PATH-ATS) questionnaires indicating if participants had ever/never used various tobacco products as of the Wave 6 data collection period. This data file contains 24 variables for all 67,276 study participants as of the Wave 6 data collection. This file is provided for reference only to simplify the definitions of tobacco use variables in the Adult and Youth data files for subsequent waves.

Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. Across all waves of data collection, an average of 0.6 percent of the parents did not complete an interview. Most questions are asked about the child.

When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s) for the other child(ren), and the responses were duplicated in the data for the other child(ren).