Version Date: May 1, 2018
Principal Investigator(s):
United States Department of Health and Human Services. National Institutes of Health. National Institute on Drug Abuse;
United States Department of Health and Human Services. Food and Drug Administration. Center for Tobacco Products
Series:
https://doi.org/10.3886/ICPSR36498.v7
Version V7
Additional information about this collection can be found in Version History.
2018-05-01
Wave 1 and Wave 2 Adult and Youth data files were updated to improve the clarity and consistency of variable labels, especially in the Nicotine Dependence section.
A new variable, R0#_ND_DATA_ROUTE, was added to the Wave 1 and Wave 2 Adult data. A second variable, R02R_A_P12M_BLUNTONLY_GRILLO, was added to the Wave 2 Adult data. An additional 18 derived variables in the Wave 2 Adult data were revised, and the revised versions replaced the originals. Each revised variable keeps its original name with "_REV" appended.
A skip error was identified in the Wave 2 Adult instrument, which resulted in some respondents being asked two questions when they should not have been. Therefore, the affected items, R02_AG0100CG and R02_AG0100FC, contain some extra data. Notes were added to the annotated instrument and codebook to describe the issue.
The User Guide and Questionnaires were also updated to improve understanding of the data files. A Nonresponse Bias Analysis report is now included for Wave 2.
2018-02-15 The citation of this study may have changed due to the new version control system that has been implemented. The previous citation was:
2017-06-14 The Wave 1 data files were updated to correct minor errors, and the questionnaires were updated to correct minor typos and clarify specifications. The Wave 2 data files, questionnaires, and codebooks were added to the study collection. Also, the Master Linkage data file was added to facilitate merging respondent records across waves. The User Guide and Master Tobacco Brand and Product Code Guide were expanded to include information about Wave 2.
2017-04-27 A minor revision was made to the Wave 1 Adult questionnaire. Two Excel crosswalks, one for Adults and one for Youth, were added to the available documentation to highlight the differences between the Wave 1 and Wave 2 files.
2017-04-03 An update was made to internal files to correct an issue with how missing values are displayed online through ICPSR's variables database.
2017-01-31 The variable R01X_CB_REGION in both the Wave 1 Adult and Youth/Parent files was updated to correct an error in the value labels. The values for codes 2 and 3 had been inadvertently swapped. The data did not change; only the value labels for codes 2 and 3 have been corrected.
2016-11-28 An additional 40 derived variables were added to the end of the Wave 1 Youth / Parent file that are similar to those already in the Wave 1 Adult file. Information for individuals who withdrew from the study is denoted in the datasets by the special missing value -97777.
2016-08-01 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:
The Population Assessment of Tobacco and Health (PATH) Study began by surveying 45,971 adult and youth respondents. The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of tobacco users and non-users.
These 45,971 individuals constitute the first (baseline) wave of data collected by this longitudinal cohort study. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) are considered "aged-up youth" upon turning 12 years old, when they are asked to join the study subject to parental consent. Please refer to the Public Use Files User Guide, which provides further details about the children designated as "shadow youth". At each subsequent wave of data collection, the parents of sampled youth are invited to complete a short Parent Interview about their child(ren).
Dataset 0001 (DS0001) contains the data from the Master Linkage file. This file contains 4 variables and 53,178 cases. The file provides a master list of every person's unique identification number and what type of respondent they were for each wave.
Dataset 1001 (DS1001) contains the data from the Wave 1 Adult Questionnaire. This data file contains 1,732 variables and 32,320 cases. Each of the cases represents a single, completed interview.
Dataset 1002 (DS1002) contains the data from the Wave 1 Youth (and Parent) Questionnaire. This file contains 1,228 variables and 13,651 cases.
Dataset 2001 (DS2001) contains the data from the Wave 2 Adult Questionnaire. This data file contains 2,197 variables and 28,362 cases. Of these cases, 26,447 also completed a Wave 1 Adult Questionnaire. The other 1,915 cases are "aged-up adults" having previously completed a Wave 1 Youth Questionnaire.
Dataset 2002 (DS2002) contains the data from the Wave 2 Youth (and Parent) Questionnaire. This data file contains 1,389 variables and 12,172 cases. Of these cases, 10,081 also completed a Wave 1 Youth Questionnaire. The other 2,091 cases are "aged-up youth" having previously been sampled as "shadow youth."
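As a rough illustration of how a linkage file of this kind could be used to separate continuing respondents from aged-up ones, the sketch below classifies people by their respondent type in each wave. The field names (person_id, wave1_type, wave2_type) and the type labels are hypothetical placeholders, not the actual variable names or codes in DS0001; the codebook is the authority for those.

```python
from collections import Counter

# Hypothetical linkage records: one row per person, with a respondent
# type for each wave. Field names and labels are illustrative only.
linkage = [
    {"person_id": "P001", "wave1_type": "adult", "wave2_type": "adult"},
    {"person_id": "P002", "wave1_type": "youth", "wave2_type": "adult"},
    {"person_id": "P003", "wave1_type": "youth", "wave2_type": "youth"},
    {"person_id": "P004", "wave1_type": "shadow_youth", "wave2_type": "youth"},
    {"person_id": "P005", "wave1_type": "adult", "wave2_type": None},
]


def classify_wave2(record):
    """Label a person's Wave 2 status from their Wave 1 and Wave 2 types."""
    w1, w2 = record["wave1_type"], record["wave2_type"]
    if w2 is None:
        return "no_wave2_interview"
    if w1 == w2:
        return "continuing_" + w2
    if w1 == "youth" and w2 == "adult":
        return "aged_up_adult"
    if w1 == "shadow_youth" and w2 == "youth":
        return "aged_up_youth"
    return "other"


# Map each person to a status, then tally the statuses.
status_by_id = {r["person_id"]: classify_wave2(r) for r in linkage}
status_counts = Counter(status_by_id.values())
```

With real data, the same classification would be driven by the respondent-type variables in the Master Linkage file, and the resulting identifiers could then be used to join a person's Wave 1 and Wave 2 records.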
Each case in an Adult data file represents a single, completed interview. Each case in a Youth data file represents one youth and his or her parent's responses about that youth. Parents who provided permission for their child to participate in a Youth Interview were asked to complete a brief interview about their child. In both waves of data collection, fewer than 0.5 percent of the parents did not complete an interview. Most questions are asked in reference to the child.
In Wave 1, about 88 percent of the "parent" respondents were the biological mother or father. When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed two or more interviews, that parent only answered questions about himself/herself once. Those questions were then skipped in the subsequent interview(s), and the responses were duplicated in the records for the other child(ren).
Census Region
Users are reminded that these data are to be used solely for statistical analysis and reporting of aggregated information, and not for the investigation of specific individuals or organizations.
The Youth Interview and Parent Interview questionnaires were distinct and separate questionnaires used in data collection. However, for each wave, both questionnaires have been combined into a single document since the responses to these questionnaires are also combined into a single file.
Variables containing demographic and health history data about the youth were mostly collected in the Parent Interview, except for a few items that youth responded to in the Youth Interview. However, since emancipated youths have no parent in their household, they responded to all of these items for themselves as part of the Youth Interview. As a result, the variables for Youth Interview questions asked only of emancipated youths were coded as "Inapplicable" for all other youths; similarly, the corresponding Parent Interview variables were coded as "Inapplicable" for emancipated youths.
In both the Adult and Youth/Parent data files, several groups of variables contain the word "RANDOM" in both the variable name and label. This indicates computerized randomization of the question order. These "RANDOM" variables detail the order in which the questions were asked of a particular respondent.
The Wave 1 data files for both Adults and Youth contain a section about tobacco advertising. This section contains 20 variable triplets. The computer randomly selected 20 advertisements and then asked the respondents whether they had seen the ad and whether they liked the ad. The Image ID variable (_AD) identifies the advertisement that was displayed to the respondent so that the ad can be characterized, e.g., by tobacco product and brand. However, vendors did not grant permission to publicly release the actual .jpg and .bmp files containing the images seen by respondents.
Derived and imputed demographic variables (age, sex, Hispanic ethnicity, and race) are included near the end of each data file. An accompanying imputation flag variable is also included. These variables are distinguished by the variable name starting with "R0#R" and contain the word "DERIVED" or "IMPUTED" in the variable label. Imputed variables are only available on the Wave 1 data files.
All Adult and Youth/Parent data files contain additional derived variables. These variables can be distinguished by the variable name starting with "R0#R" and contain the word "DERIVED" in the variable label. There are several variables for each tobacco category to identify certain classes of current and former tobacco users.
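The naming conventions described above can be checked programmatically when scanning a long variable list. The following is a minimal sketch, assuming that names follow the pattern "R0", a wave digit, an optional one-letter code, and an underscore; that pattern is inferred from the examples on this page rather than taken from the codebook, and the name ending in "_REV" is a made-up example. A matching name prefix alone does not confirm that a variable is derived, since the label must also contain "DERIVED" or "IMPUTED".

```python
import re

# Assumed pattern: "R0", wave digit, optional one-letter code, underscore.
NAME_PATTERN = re.compile(r"^R0(?P<wave>\d)(?P<kind>[A-Z]?)_")


def describe_variable(name):
    """Return wave and naming flags for a variable name, or None if no match."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return {
        "wave": int(match.group("wave")),
        # "R" after the wave digit marks the derived/imputed naming prefix.
        "derived_or_imputed_prefix": match.group("kind") == "R",
        # Revised derived variables carry a "_REV" suffix.
        "revised": name.endswith("_REV"),
    }


examples = [
    "R02R_A_P12M_BLUNTONLY_GRILLO",  # named on this page
    "R01X_CB_REGION",                # named on this page
    "R02_AG0100CG",                  # named on this page
    "R02R_A_EXAMPLE_REV",            # hypothetical name, for illustration only
    "VARPSU",                        # design variable, does not match
]
described = {name: describe_variable(name) for name in examples}
```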
In accordance with the study's informed consent, information about individuals who withdrew from the PATH Study is suppressed. Their information was recoded to a special missing value, designated as -97777.
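Because -97777 is stored as an ordinary number, it should be masked before computing any statistics. The sketch below shows one way to do that in plain Python; the row layout and the item names (item_a, item_b) are hypothetical, and the data files may use other special missing codes that this sketch does not handle.

```python
WITHDRAWN = -97777  # special missing value for withdrawn respondents


def mask_withdrawn(rows):
    """Return copies of the rows with the withdrawn code replaced by None."""
    return [
        {key: (None if value == WITHDRAWN else value) for key, value in row.items()}
        for row in rows
    ]


def mean_ignoring_missing(rows, key):
    """Average a numeric item over rows where it is not None."""
    present = [row[key] for row in rows if row[key] is not None]
    return sum(present) / len(present) if present else None


# Hypothetical records; the second person withdrew from the study.
raw_rows = [
    {"item_a": 1, "item_b": 20},
    {"item_a": -97777, "item_b": -97777},
    {"item_a": 3, "item_b": 30},
]
clean_rows = mask_withdrawn(raw_rows)
mean_item_b = mean_ignoring_missing(clean_rows, "item_b")
```

Without the masking step, the mean of item_b in this toy example would be dominated by the -97777 code instead of reflecting the two valid responses.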
The current release contains the public-use versions of Wave 1 and Wave 2 data files. Wave 3 data files are tentatively planned to be released in 2018.
The Nonresponse Bias Analysis Report for Wave 1 details the response rates and the potential for bias from nonresponse. There is also a Nonresponse Bias Analysis Report for Wave 2.
The Informed Consent Document and Nonresponse Bias Analysis Reports are specific to each wave they are listed for. They are listed for both the Adult File and the Youth/Parent File, but they are the same files.
The questionnaires in this collection are updated versions of the fielded questionnaires that were annotated for analytic purposes. Spanish versions of the instruments are available on the restricted-use files home page.
Additional background information including answers to frequently asked questions for study participants and researchers can be found in the Researchers section of the PATH Study series page.
The Public Use Files User Guide provides an overview of the entire PATH Study. The guide covers topics such as sample design, data collection, weighting, response rates, and programming syntax to run common statistics and link the files together. Researchers should feel free to use the information in the User Guide for their publications, and the guide should be cited as follows:
The data for the PATH Study were collected and prepared by Westat. The contract number under which Westat performed this work is: HHSN271201100027C.
The Population Assessment of Tobacco and Health (PATH) Study is a longitudinal cohort study on tobacco use behavior, attitudes and beliefs, and tobacco-related health outcomes among approximately 46,000 adults and youth in the United States. The study's primary objectives are to:
The study sampled over 150,000 mailing addresses which, using a four-stage stratified sampling design, yielded a sample of 45,971 respondents (32,320 adults / 13,651 youth) who completed a Wave 1 interview. Tobacco users and non-users who were at least 9 years old living in a civilian, non-institutionalized setting were considered for participation during Wave 1. Youth who turn 18 by the next wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) are considered "aged-up youth" upon turning 12 years old when they are asked to join the study.
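The case counts quoted on this page can be cross-checked with simple arithmetic. The sketch below uses only figures stated here. The final check is an observation rather than a documented rule: the Master Linkage case count equals the Wave 1 respondents plus the shadow youth, but this page does not say the file was constructed that way.

```python
# Figures quoted on this page.
wave1_adults = 32_320
wave1_youth = 13_651
wave1_total = 45_971

wave2_adults_continuing = 26_447
wave2_adults_aged_up = 1_915
wave2_adult_cases = 28_362

wave2_youth_continuing = 10_081
wave2_youth_aged_up = 2_091
wave2_youth_cases = 12_172

shadow_youth = 7_207
linkage_cases = 53_178

checks = {
    "wave 1 adults + youth": wave1_adults + wave1_youth == wave1_total,
    "wave 2 adult cases": wave2_adults_continuing + wave2_adults_aged_up == wave2_adult_cases,
    "wave 2 youth cases": wave2_youth_continuing + wave2_youth_aged_up == wave2_youth_cases,
    # Observation only; not stated as a rule on this page.
    "linkage = wave 1 + shadow youth": wave1_total + shadow_youth == linkage_cases,
}

for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
```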
The Adult files contain a single record for every adult participant. The Youth/Parent files contain a single record for every youth who participated in a given wave. Parents who provided permission for their child to complete a Youth Interview were asked to complete a brief Parent Interview that contained questions about parental supervision, school performance, and tobacco use by youth. The Parent Interview is primarily an interview about the child(ren), not the parent. In both waves, almost all youth had a parent or guardian complete the Parent Interview (over 99.5 percent). When multiple youth from the same household were selected to be in the study, the parent(s) completed separate interviews about each youth. If one parent completed multiple interviews, then questions asked about him or her were only asked once and skipped in the other interview(s). The parent's responses were then duplicated for the other child or children.
A $2 incentive was mailed to all addresses sampled at Wave 1 prior to screening. For both Wave 1 and Wave 2, adult respondents were paid $35 for their participation. Youth were paid $25 to complete the Youth Interview, and their parents were given $10 for each Parent Interview.
A four-stage stratified area probability sample design was used in the PATH Study, with a two-phase design for sampling adults at the final stage. At the first stage, a stratified sample of geographical primary sampling units (PSUs) was selected, in which a PSU is a county or group of counties. For the second stage, within each selected PSU, smaller geographical segments were formed and then a sample of these segments was drawn. At the third stage, the sampling frame consisted of the residential addresses located in these segments. The fourth stage selected adults and youth from the sampled households identified at these addresses, with varying sampling rates for adults by age, race, and tobacco use status. Adults were sampled in two phases - Phase 1 sampling used information provided in the household screener and Phase 2 sampling used information provided by the adult in the Phase 2 screener at the beginning of the Adult instrument. Please consult the Public Use Files User Guide for additional details about the sampling.
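The nesting of the first three stages can be pictured with a toy simulation. This is only a sketch under strong simplifying assumptions: it uses simple random sampling at every stage, a made-up frame, and fixed sample sizes, whereas the actual design used stratification, varying sampling rates, and a fourth stage with two-phase selection of adults that is not modeled here.

```python
import random


def multistage_sample(frame, n_psu, n_seg, n_addr, seed=0):
    """Draw a nested sample: PSUs, then segments, then addresses.

    `frame` maps PSU -> segment -> list of addresses. Simple random
    sampling without replacement is used at each stage, which is a
    simplification of the stratified design described above.
    """
    rng = random.Random(seed)
    sampled = []
    for psu in rng.sample(sorted(frame), n_psu):           # stage 1: PSUs
        segments = frame[psu]
        for seg in rng.sample(sorted(segments), n_seg):    # stage 2: segments
            for addr in rng.sample(segments[seg], n_addr): # stage 3: addresses
                sampled.append((psu, seg, addr))
    return sampled


# Toy frame: 6 PSUs, each with 4 segments of 10 addresses.
frame = {
    f"psu{p}": {
        f"seg{s}": [f"addr{p}-{s}-{a}" for a in range(10)] for s in range(4)
    }
    for p in range(6)
}
sample = multistage_sample(frame, n_psu=3, n_seg=2, n_addr=5)
```

Because every sampled address sits inside a sampled segment, which sits inside a sampled PSU, observations are clustered. That clustering is why variance estimation for such designs relies on design variables or replicate weights rather than simple-random-sample formulas.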
Users and non-users of tobacco products in the civilian, non-institutionalized household population of the United States aged 9 and older at the time of Wave 1.
In both waves, adults and youth were asked about the following types of tobacco products:
Although each tobacco product section has some unique questions, the majority of the questions fit into one of the following categories:
Additional topics, in at least one wave, include:
Most questions asked in the questionnaires are categorical. Other questions ask, for example, the age at which something occurred or the person's body measurements. Responses to these questions are numerical.
The response rates for the PATH Study are shown below. The Wave 1 interview rates are conditional on completion of the Wave 1 screener. The response rates for Wave 2 are conditional on Wave 1 participation.
Please consult the Public Use Files User Guide for further information regarding the response rates of data collection.
Each data file contains weights for use in analyses of the data from the complex PATH Study sample design. The final full-sample person-level weight for each wave of the Adult file is R0#_A_PWGT, and the final full-sample person-level weight for each wave of the Youth/Parent file is R0#_Y_PWGT. There are also 100 replicate weights and design variables (VARPSU and VARSTRAT) for use in variance estimation. Detailed information on how these variables were created, and how and why they should be used is provided in the Public Use Files User Guide.
Note that the weighting procedures adjust for oversampling of specified population groups and nonresponse. ICPSR strongly recommends that researchers read and understand this section before analyzing the data to ensure correct use of these variables.
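To make the role of the full-sample and replicate weights concrete, the sketch below estimates a weighted proportion and a replicate-based variance in plain Python. This page says only that 100 replicate weights are provided; the replication formula used here (balanced repeated replication with an optional Fay factor), the `fay` value, and the toy data are assumptions for illustration. The replication method and any adjustment factor actually required should be taken from the Public Use Files User Guide.

```python
def weighted_proportion(values, weights):
    """Weighted share of records whose value is truthy."""
    total = sum(weights)
    return sum(w for v, w in zip(values, weights) if v) / total


def replicate_variance(values, full_weights, replicate_weights, fay=0.0):
    """Point estimate and replicate-based variance of a weighted proportion.

    Assumed formula (BRR with Fay factor `fay`):
        var = sum((theta_r - theta) ** 2) / (R * (1 - fay) ** 2)
    where theta uses the full-sample weight and theta_r the r-th
    replicate weight.
    """
    theta = weighted_proportion(values, full_weights)
    reps = [weighted_proportion(values, rw) for rw in replicate_weights]
    scale = len(reps) * (1.0 - fay) ** 2
    variance = sum((t - theta) ** 2 for t in reps) / scale
    return theta, variance


# Toy data: 4 respondents, 2 replicate weight sets (real files have 100).
uses_product = [1, 0, 1, 0]
full_weight = [10.0, 20.0, 30.0, 40.0]   # stands in for R0#_A_PWGT
replicates = [
    [15.0, 10.0, 45.0, 20.0],
    [5.0, 30.0, 15.0, 60.0],
]
estimate, variance = replicate_variance(uses_product, full_weight, replicates, fay=0.5)
standard_error = variance ** 0.5
```

An unweighted mean of `uses_product` would give 0.5, while the weighted estimate differs because respondents represent different numbers of people in the population.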