Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files (ICPSR 38008)

Version Date: Apr 27, 2021

Principal Investigator(s):
United States Department of Health and Human Services. National Institutes of Health. National Institute on Drug Abuse; United States Department of Health and Human Services. Food and Drug Administration. Center for Tobacco Products

https://doi.org/10.3886/ICPSR38008.v1

Version V1

  • V15 [2024-10-11]
  • V14 [2024-06-14] unpublished
  • V13 [2024-04-08] unpublished
  • V12 [2023-12-15] unpublished
  • V11 [2023-09-18] unpublished
  • V9 [2023-03-31] unpublished
  • V8 [2022-12-16] unpublished
  • V7 [2022-10-07] unpublished
  • V6 [2022-05-11] unpublished
  • V5 [2022-04-21] unpublished
  • V4 [2021-12-16] unpublished
  • V3 [2021-09-29] unpublished
  • V2 [2021-06-03] unpublished
  • V1 [2021-04-27] unpublished

You are currently viewing an older version of this data collection. A more recent version may be available.

Additional information about this collection can be found in Version History.

2021-04-27 ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

  • Checked for undocumented or out-of-range codes.


PATH Study (MLF)

The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of tobacco users and non-users.

45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent.

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled PSUs and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.

Please refer to the Restricted-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1 and Wave 4 Cohorts.

Dataset 0001 (DS0001) contains the data from the Public-Use Master Linkage File (PUF-MLF). This file contains 30 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files.

Dataset 0002 (DS0002) contains the data from the Restricted-Use Master Linkage File (RUF-MLF). This file contains 93 variables and 67,276 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.

United States Department of Health and Human Services. National Institutes of Health. National Institute on Drug Abuse, and United States Department of Health and Human Services. Food and Drug Administration. Center for Tobacco Products. Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files. Inter-university Consortium for Political and Social Research [distributor], 2021-04-27. https://doi.org/10.3886/ICPSR38008.v1

United States Department of Health and Human Services. National Institutes of Health. National Institute on Drug Abuse, United States Department of Health and Human Services. Food and Drug Administration. Center for Tobacco Products

None

Users are reminded that these data are to be used solely for statistical analysis and reporting of aggregated information, and not for the investigation of specific individuals or organizations.

Access to the RUF-MLF data is restricted. Users interested in obtaining these data must complete a Restricted Data Use Agreement. Data are provided via ICPSR's Virtual Data Enclave (VDE). Apply for access to these data through the ICPSR VDE portal. Information and instructions are available within the data portal. For further assistance please reference the VDE Guide to learn about the application process, about using the VDE, and how to request disclosure review of VDE output.

Inter-university Consortium for Political and Social Research

2013 -- 2014 (Wave 1), 2014 -- 2015 (Wave 2), 2015 -- 2016 (Wave 3), 2016 -- 2018 (Wave 4), 2017 -- 2018 (Wave 4.5), 2018 -- 2019 (Wave 5)
2013-09 -- 2014-12 (Wave 1), 2014-10 -- 2015-10 (Wave 2), 2015-10 -- 2016-10 (Wave 3), 2016-12 -- 2018-01 (Wave 4), 2017-12 -- 2018-12 (Wave 4.5), 2018-12 -- 2019-11 (Wave 5)
  1. The PATH Study Data User Forum allows researchers using any PATH Study data files to communicate with each other to ask and answer questions. Announcements, data releases and updates, new publications, upcoming events, and other information for PATH Study data users will also be posted to the forum.

  2. The PUF-MLF is available for access by the general public. For the RUF-MLF, data are provided via ICPSR's Virtual Data Enclave (VDE) where researchers will work with data stored on secure ICPSR servers. Researchers will not possess actual physical copies of the data; however, they may request permission to access selected output outside the virtual environment after review by ICPSR. See the Access Notes to apply for access. Researchers are also encouraged to read the VDE Guide.

  3. The data files contain person-level identifiers (PERSONID) that link participants across waves of data collection. The PERSONID values are random and contain no direct or indirect personally identifiable information. Chapter 7 in the Public-Use Files User Guide contains information about linking data available for public use. Appendix E in the Restricted-Use Files User Guide also contains information and programming code on linking files together. The files are sorted by the variable PERSONID.

  4. The PUF-MLF includes indicator variables for the availability of interview data and weights for each participant. It also includes variables that indicate availability of biospecimens through the Biospecimen Access Program (BAP). The PUF-MLF can help analysts identify which Public-Use files contain data for a particular participant (or set of participants).

  5. The RUF-MLF includes indicator variables for the availability of interview data, weights, state identifier data, tobacco Universal Product Code (UPC) data, and biomarker data for each participant. It also includes variables that indicate availability of biospecimens through the Biospecimen Access Program (BAP). The RUF-MLF can help analysts identify which Restricted-Use files contain data for a particular participant (or set of participants).

  6. The RUF-MLF and PUF-MLF will be extended as new data are released in each respective collection.

  7. The PATH Study's documentation is available for your use and may be reproduced in whole or in part without permission from NIH's National Institute on Drug Abuse or FDA's Center for Tobacco Products. Citation of the source is appreciated.

  8. Additional background information including answers to frequently asked questions for study participants and researchers can be found in the Researchers section of the PATH Study Series page.

  9. There are a variety of user guides available that describe the PATH Study as well as the use of specific types of data. Researchers can access the user guides on the PATH Study Series page or through the various collections: Restricted-Use Files, Public-Use Files, Special Collection Restricted-Use Files, Special Collection Public-Use Files, or Biomarker Restricted-Use Files.

  10. 2021-04-27 Latest versions of RUF-MLF and PUF-MLF were added to the collection, consolidating the various MLFs that were in each collection: Restricted-Use Files, Public-Use Files, Special Collection Restricted-Use Files, Special Collection Public-Use Files, and Biomarker Restricted-Use Files.

  11. The data for the PATH Study were collected and prepared by Westat. The contract numbers under which they performed their work are HHSN271201100027C and HHSN271201600001C.
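
Notes 3 through 5 describe linking files on PERSONID and using the MLF indicator variables to see which files hold data for a participant. A minimal sketch of that idea in Python follows. Only PERSONID is a variable name taken from the notes; the indicator name `W4_INTERVIEW`, the item `EXAMPLE_ITEM`, and all rows are hypothetical, so consult the user guides and codebooks for the actual variable names.

```python
# Toy rows standing in for an MLF and one wave file. Only PERSONID is a
# variable name from the documentation; every other name and value here
# is hypothetical.
mlf_rows = [
    {"PERSONID": "P0001", "W1_INTERVIEW": 1, "W4_INTERVIEW": 1},
    {"PERSONID": "P0002", "W1_INTERVIEW": 1, "W4_INTERVIEW": 0},
    {"PERSONID": "P0003", "W1_INTERVIEW": 0, "W4_INTERVIEW": 1},
]
wave4_rows = [
    {"PERSONID": "P0001", "EXAMPLE_ITEM": 2},
    {"PERSONID": "P0003", "EXAMPLE_ITEM": 5},
]


def link_on_personid(mlf, wave, flag):
    """Keep people whose flag is 1 and attach their wave-file row, if any."""
    wave_by_id = {row["PERSONID"]: row for row in wave}
    linked = []
    for person in mlf:
        if person[flag] != 1:
            continue  # the MLF indicates no data for this person in that file
        merged = dict(person)
        merged.update(wave_by_id.get(person["PERSONID"], {}))
        linked.append(merged)
    return linked


linked_w4 = link_on_personid(mlf_rows, wave4_rows, "W4_INTERVIEW")
linked_ids = [row["PERSONID"] for row in linked_w4]
print(linked_ids)
```

Filtering on the indicator before merging mirrors the stated purpose of the MLF: it tells the analyst up front which participants to expect in a given file.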


The Population Assessment of Tobacco and Health (PATH) Study is a longitudinal cohort study on tobacco use behavior, attitudes and beliefs, and tobacco-related health outcomes among adults and youth in the United States. The study's primary objectives are to:

  • Objective 1: Identify and explain between-person differences and within-person changes in tobacco-use patterns, including the rate and length of use by specific product type and brand, product/brand switching over time, uptake of new products, and dual- and poly-use of tobacco products (i.e., use of multiple products within the same time period and switching between multiple products).
  • Objective 2: Identify between-person differences and within-person changes in risk perceptions regarding harmful and potentially harmful constituents, new and emerging tobacco products, filters and other design features of tobacco products, packaging, and labeling; and identify other factors that may affect use, such as social influences and individual preferences.
  • Objective 3: Characterize the natural history of tobacco dependence, cessation, and relapse, including readiness and self-efficacy to quit, motivations for quitting, the number and length of quit attempts, and the length of abstinence related to various tobacco products.
  • Objective 4: Update the comprehensive baseline and subsequent waves of data on tobacco-use behaviors and related health conditions, including markers of exposure and tobacco-related disease processes identified from the collection and analysis of biospecimens, to assess between-person differences and within-person changes over time in health conditions potentially related to tobacco use, particularly with use of new and different tobacco products, including modified-risk tobacco products.
  • Objective 5: Assess associations between TCA-specific actions and tobacco-product use, risk perceptions and attitudes, use patterns, cessation outcomes, and tobacco-related intermediate endpoints (e.g., biomarkers of exposure and biomarkers related to disease). Analyses will attempt to account for other potential factors, such as demographics, local tobacco-control policies, and social, familial, and economic factors, that may influence the observed patterns.
  • Objective 6: Assess between-person differences and within-person changes over time in attitudes, behaviors, exposure to tobacco products, and related biomarkers among and within population sub-groups identified by such characteristics as race-ethnicity, gender, and/or age, or by risk factors, such as pregnancy or co-occurring substance use or mental health disorders.
  • Objective 7: To the extent to which sample sizes are sufficient, assess and compare samples of former and never users of tobacco products for between-person differences and within-person changes in relapse and uptake, risk perceptions, and indicators of tobacco exposure and disease processes.
  • Objective 8: Use data from the PATH Study's baseline and follow-up waves on tobacco-use behaviors, attitudes, and related health conditions, including potential markers of exposure and related disease processes identified from the analysis of biospecimens, to screen and subsample respondents for participation in formative and/or nested studies conducted during and after the PATH Study's waves of data and biospecimen collection.

At Wave 1, the study sampled over 150,000 mailing addresses which, using a four-stage stratified sampling design, yielded a sample of 45,971 respondents (32,320 adults / 13,651 youth) who completed a Wave 1 interview. Tobacco users and non-users who were at least 9 years old living in a civilian, non-institutionalized setting were considered for participation during Wave 1. Youth who turn 18 by the next wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) are considered "aged-up youth" upon turning 12 years old, when they are asked to join the study. These 53,178 participants form the Wave 1 Cohort.
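
The Wave 1 counts quoted above can be checked with simple arithmetic. The short Python sketch below uses only the figures stated in the paragraph; the variable names are illustrative.

```python
# Counts quoted in the Wave 1 description.
wave1_adults = 32_320
wave1_youth = 13_651
shadow_youth = 7_207

# Respondents who completed a Wave 1 interview.
wave1_respondents = wave1_adults + wave1_youth

# Wave 1 Cohort: interview respondents plus shadow youth.
wave1_cohort = wave1_respondents + shadow_youth

print(wave1_respondents)  # 45971
print(wave1_cohort)       # 53178
```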

At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from close to 174,000 mailing addresses not selected for Wave 1, in the same sampled PSUs and segments using similar within-household sampling procedures. To meet the needs for the Wave 4 Cohort shadow sample, a randomly selected subset of the sampled addresses (115,500 or close to two-thirds of the addresses) were screened solely to identify shadow youth ages 10 to 11. The remaining addresses (close to 58,500) were screened for adults, youth, and shadow youth ages 10 to 11. These are referred to as the "SO" (shadow youth only) and "AYS" (adults, youth, and shadow youth) replenishment samples, respectively. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort.
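
The address counts for the two replenishment samples are given as approximations ("close to"), so the following Python sketch is only a rough consistency check of the stated split, not an exact accounting.

```python
# Approximate address counts quoted for the Wave 4 replenishment sample.
so_addresses = 115_500   # screened for shadow youth only ("SO")
ays_addresses = 58_500   # screened for adults, youth, and shadow youth ("AYS")

total_addresses = so_addresses + ays_addresses
so_share = so_addresses / total_addresses

print(total_addresses)     # 174000
print(round(so_share, 3))  # 0.664, i.e. close to two-thirds
```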

A four-stage stratified area probability sample design was used in the PATH Study, with a two-phase design for sampling adults at the final stage. At the first stage, a stratified sample of geographical primary sampling units (PSUs) was selected, in which a PSU is a county or group of counties. For the second stage, within each selected PSU, smaller geographical segments were formed and then a sample of these segments was drawn. At the third stage, the sampling frame consisted of the residential addresses located in these segments. The fourth stage selected adults and youth from the sampled households identified at these addresses, with varying sampling rates for adults by age, race, and tobacco use status. Adults were sampled in two phases: Phase 1 sampling used information provided in the household screener and Phase 2 sampling used information provided by the adult in the Phase 2 screener at the beginning of the Adult instrument. Please consult the Public-Use Files User Guide or Restricted-Use Files User Guide for additional details about the sampling.
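
The nesting of the four stages can be illustrated with a toy Python sketch. It is not the PATH Study's actual procedure: the frame, the sizes, and the equal-probability draws are all assumptions made for illustration, and the sketch omits the varying sampling rates and the two-phase adult sampling described above.

```python
import random

rng = random.Random(0)  # fixed seed so the toy draw is repeatable

# Hypothetical frame: strata -> PSUs -> segments -> addresses -> people.
frame = {
    f"stratum{s}": {
        f"psu{s}{p}": {
            f"seg{s}{p}{g}": {
                f"addr{s}{p}{g}{a}": [f"person{s}{p}{g}{a}{i}" for i in range(3)]
                for a in range(4)
            }
            for g in range(3)
        }
        for p in range(4)
    }
    for s in range(2)
}

sample = []
for stratum, psus in frame.items():
    # Stage 1: sample PSUs within each stratum.
    for psu in rng.sample(sorted(psus), 2):
        # Stage 2: sample segments within each selected PSU.
        for seg in rng.sample(sorted(psus[psu]), 2):
            addresses = psus[psu][seg]
            # Stage 3: sample residential addresses within each segment.
            for addr in rng.sample(sorted(addresses), 2):
                # Stage 4: sample one person from each sampled household.
                sample.append(rng.choice(addresses[addr]))

print(len(sample))
```

Each stage draws only from units selected at the previous stage, which is what makes the design multi-stage rather than a single draw from a list of people.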

Longitudinal: Panel

Users and non-users of tobacco products in the civilian, non-institutionalized household population of the United States aged 9 and older at the time of Wave 1 (Wave 1 Cohort); Users and non-users of tobacco products in the civilian, non-institutionalized household population of the United States aged 10 and older at the time of Wave 4 (Wave 4 Cohort)

individual

In the PUF-MLF, indicator variables that identify the availability of interview data, weights, and biospecimens (through the BAP) for each participant (or set of participants) with Public-Use data.

In the RUF-MLF, indicator variables that identify the availability of interview data, weights, biomarker data, and biospecimens (through the BAP) for each participant (or set of participants) with Restricted-Use data.


2021-04-27


There are no weights associated with the Master Linkage Files.
