Search results

Showing 1 – 2 of 2 results.
Curated

Developing and Testing New Methods for Estimating Treatment Effectiveness in Observational Studies Using High-Dimensional Data [Methods Study], 2023 (ICPSR 39090)

Released/updated on: 2024-04-18

Propensity scores (PS) and instrumental variables (IV) are methods used to assess treatment effects in observational studies when randomized controlled trials (RCTs) are not feasible. However, these methods have limitations, especially with high-dimensional data, that is, data with numerous variables or many non-linear and interaction terms. Choices about which variables and which non-linear and interaction terms to include may lead to model misspecification. The objective of this study was to develop and test a set of PS and IV methods that account for model misspecification when estimating causal effects of treatments using high-dimensional data.

First, the research team created the two new methods for use with high-dimensional data. The team then used a computer program to create test data that look like real patient data. The team applied the new methods to the test data. Next, the research team applied the new methods to real data from previous studies. They applied the PS method to data from Connors et al. (1996) and applied the IV method to data used by Card (1995). Using both test and real data, the research team compared findings from the new methods with those from existing PS and IV methods and checked to see if findings from the new methods were accurate when including different patient traits and health conditions in the analysis.

This collection contains the R software package RCAL and accompanying documentation. The package source (a .tar.gz file) and six platform-specific versions are available in a zipped package. Files have been released as received by ICPSR from the depositor:

  • For R version 4.2, created April 24, 2022 (Windows, r-oldrel)
  • For R version 4.3, created October 20, 2023 (Windows, r-release)
  • For R version 4.4, created March 14, 2024 (Windows, r-devel)
  • For R version 4.2, created April 1, 2023 (Mac, arm64, r-oldrel)
  • For R version 4.3, created April 6, 2023 (Mac, arm64, r-release)
  • For R version 4.3, created April 11, 2023 (Mac, x86_64, r-release)

Curated

Improving Causal Inference Methods via Statistical Learning with High-Dimensional Data [Methods Study], 2016-2021 (ICPSR 39713)

Released/updated on: 2026-03-12
Time period: 2016-01-01 to 2021-01-01

A randomized controlled trial, or RCT, is often the best way to learn if one treatment works better than another. RCTs assign patients to different treatments by chance. But RCTs are not always feasible. In such cases, researchers can use observational studies. In observational studies, researchers look at what happens when patients and their doctors choose the treatments. Traits such as age, gender, or health status may affect treatment choices. These traits may also affect patients' health, making it hard to know if changes in patients' health are due to treatment or to patient traits.
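As a rough illustration of the problem described above, the following Python sketch simulates hypothetical patients whose health status affects both the treatment they receive and their outcome. The variable names, coefficients, and data-generating process are all assumptions made for illustration; none of them come from the study's data or software.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical patient trait: a standardized severity-of-illness score.
severity = rng.normal(size=n)

# Assumption: sicker patients are more likely to receive the treatment.
p_treat = 1.0 / (1.0 + np.exp(-1.5 * severity))
treated = rng.random(n) < p_treat

# Assumption: the treatment raises the outcome by 1.0, while severity
# lowers the outcome regardless of treatment.
true_effect = 1.0
outcome = true_effect * treated - 2.0 * severity + rng.normal(size=n)

# Naive comparison: mean outcome of treated minus untreated patients.
naive_diff = outcome[treated].mean() - outcome[~treated].mean()
print(f"true effect: {true_effect:.2f}, naive difference: {naive_diff:.2f}")
```

In this simulated setup the naive difference comes out negative even though the treatment helps, because the treated group is sicker to begin with.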

To figure out whether changes in patients' health result from treatment or something else, researchers use statistical methods. Two of these methods are:

  • Propensity score, or PS. PS methods compare the health of patients who have similar measured traits but received different treatments. These traits are in patient health records.
  • Instrumental variable, or IV. IV methods account for things that may affect treatment choice and patients' health but aren't in the patients' health records, such as personal preference about treatment.
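
The two ideas above can be sketched on simulated data. This Python example is a minimal illustration under stated assumptions, not the research team's methods: the PS part uses a plain logistic regression followed by stratification on the estimated score, and the IV part uses the simple Wald ratio estimator with one binary instrument. The function names, coefficients, and simulated traits are hypothetical, and the example is low-dimensional, so it does not address the high-dimensional setting the study targets.

```python
import numpy as np

TRUE_EFFECT = 1.0  # assumed constant treatment effect in both simulations


def fit_logistic(x, t, n_iter=25):
    """Fit P(t = 1 | x) by Newton's method and return fitted probabilities."""
    X = np.column_stack([np.ones_like(x), x])
    t = t.astype(float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))


def ps_stratified_effect(x, t, y, n_strata=20):
    """Compare treated and untreated patients within propensity-score strata."""
    ps = fit_logistic(x, t)
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    stratum = np.clip(np.searchsorted(edges, ps, side="right") - 1,
                      0, n_strata - 1)
    total, weight = 0.0, 0
    for s in range(n_strata):
        in_s = stratum == s
        treated, control = in_s & t, in_s & ~t
        # Skip strata that lack either treated or untreated patients.
        if treated.any() and control.any():
            total += in_s.sum() * (y[treated].mean() - y[control].mean())
            weight += in_s.sum()
    return total / weight


def wald_iv_effect(z, t, y):
    """Wald estimate: covariance of z with y divided by covariance of z with t."""
    z, t = z.astype(float), t.astype(float)
    return np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]


rng = np.random.default_rng(0)
n = 200_000

# Simulation 1: confounding by a measured trait x (hypothetical health score).
x = rng.normal(size=n)
t1 = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
y1 = TRUE_EFFECT * t1 - 2.0 * x + rng.normal(size=n)
naive1 = y1[t1].mean() - y1[~t1].mean()
ps_est = ps_stratified_effect(x, t1, y1)

# Simulation 2: confounding by an unmeasured trait u (e.g. patient preference),
# with a binary instrument z that shifts treatment but, by assumption, affects
# the outcome only through treatment.
u = rng.normal(size=n)
z = rng.random(n) < 0.5
t2 = (2.0 * z + u + rng.normal(size=n)) > 1.0
y2 = TRUE_EFFECT * t2 - 2.0 * u + rng.normal(size=n)
naive2 = y2[t2].mean() - y2[~t2].mean()
iv_est = wald_iv_effect(z, t2, y2)

print(f"measured confounder:   naive {naive1:.2f}, PS-stratified {ps_est:.2f}")
print(f"unmeasured confounder: naive {naive2:.2f}, IV (Wald) {iv_est:.2f}")
```

In both simulations the naive treated-versus-untreated comparison is far from the assumed effect of 1.0, while the adjusted estimates land close to it. The PS estimate relies on the confounding trait being measured, and the IV estimate relies on the instrument being valid; neither assumption can be checked from the outcome data alone.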

But existing PS and IV methods don't work well when data sets include a lot of traits and health conditions for each patient. Such data sets are called high-dimensional data. In this study, the research team created and tested one PS method and one IV method for use with high-dimensional data.