Towards a New Generation of Matching Methods for Comparative Effectiveness Research [Methods Study], Chile and United States, 2008-2023 (ICPSR 39744)
Version Date: Mar 23, 2026
Principal Investigator(s):
José R. Zubizarreta, Harvard University
https://doi.org/10.3886/ICPSR39744.v1
Version V1
Summary
Comparative effectiveness research compares two or more treatments to see which one works better for which patients. When researchers can't assign patients by chance to treatments, they can use observational studies. In observational studies, researchers use data like health records to compare treatment effects. But it can be hard to know if the effects are due to the treatment or to patient traits, like age.
To address this issue, researchers can use a statistical method called propensity score matching, or PSM. With PSM, researchers create groups of patients for analysis who have received different treatments, matching patients with similar traits across groups. This method reduces bias when comparing treatments. But current PSM methods perform poorly, or may take many hours to run, when comparing three or more treatments or when using large data sets.
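As a rough illustration of the matching idea described above, the sketch below pairs each treated patient with the closest unused comparison patient on an estimated propensity score. It is a minimal greedy example, not the procedure used in this study. The function name `greedy_match`, the `caliper` value, and all scores are hypothetical, and the scores are assumed to have been estimated beforehand.

```python
def greedy_match(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a patient id to an estimated
    propensity score (hypothetical values, assumed already estimated).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # comparison patients not yet used
    pairs = []
    # Process treated patients in a fixed order so results are reproducible.
    for t_id in sorted(treated):
        if not available:
            break
        score = treated[t_id]
        # Closest remaining comparison patient by absolute score distance.
        c_id = min(available, key=lambda c: (abs(available[c] - score), c))
        # Keep the pair only if the scores are close enough.
        if abs(available[c_id] - score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Hypothetical scores for three treated and four comparison patients.
treated = {"t1": 0.62, "t2": 0.35, "t3": 0.90}
control = {"c1": 0.60, "c2": 0.30, "c3": 0.41, "c4": 0.10}
pairs = greedy_match(treated, control, caliper=0.1)
print(pairs)
```

Patient `t3` is left unmatched here because no remaining comparison patient has a score within the caliper. With only two groups, this pairwise approach is straightforward; extending it to three or more groups is where the difficulty noted above arises.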
In this study, the research team created and tested a new method for matching patients from large data sets to compare the effects of three or more treatments.
Study Purpose
To develop and test a new matching method for comparing three or more treatments using large data sets
Study Design
First, the research team developed a new matching method, based on an optimization approach called linear-sized mixed integer programming (MIP), to estimate the effects of three or more treatments using patient data from large data sets. Instead of matching patients directly across treatment groups, the new method matches each treatment group separately to a representative group from the intended population.
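The idea of matching each treatment group separately to a common target can be sketched as follows. This toy example is an assumption-laden stand-in: it uses exhaustive search over subsets rather than a mixed integer program, and the function names, covariates (age, prior visits), target means, and data are all hypothetical. It only shows the structure of the approach, in which no patient is ever compared directly with a patient from another treatment group.

```python
from itertools import combinations


def imbalance(units, target_means):
    """Largest absolute gap between subset covariate means and the target."""
    n = len(units)
    gaps = []
    for j, target in enumerate(target_means):
        mean_j = sum(u[j] for u in units) / n
        gaps.append(abs(mean_j - target))
    return max(gaps)


def match_to_target(group, target_means, size):
    """Pick `size` units from one treatment group closest to the target.

    Exhaustive search, feasible only for toy inputs. It stands in for
    the optimization step and is not the study's MIP formulation.
    """
    best = min(
        combinations(sorted(group), size),
        key=lambda ids: imbalance([group[i] for i in ids], target_means),
    )
    return list(best)


# Hypothetical target profile: mean age 50, mean of 2 prior visits.
target_means = (50, 2)

# Hypothetical patients per treatment group: id -> (age, prior visits).
groups = {
    "A": {"a1": (40, 1), "a2": (60, 3), "a3": (70, 5), "a4": (50, 2)},
    "B": {"b1": (30, 0), "b2": (45, 2), "b3": (55, 2), "b4": (80, 6)},
    "C": {"c1": (48, 1), "c2": (52, 3), "c3": (65, 4)},
}

# Each group is matched to the target on its own, never to another group.
matched = {
    name: match_to_target(patients, target_means, size=2)
    for name, patients in groups.items()
}
print(matched)
```

Because every selected subset resembles the same target profile, the subsets also resemble one another, and adding a fourth treatment group only adds one more independent selection problem.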
Using simulated data, the research team compared the new method with existing PSM methods. The team measured the computing time it took to match groups of varying sizes and treatments.
Next, the research team tested the new method with observational data collected in 2008 and 2010 from 121,279 high school students exposed to a 2010 earthquake in Chile. The team created three measures of earthquake exposure based on peak ground acceleration: one with 3 levels of shaking, one with 5 levels, and one with 10 levels. The team examined covariate balance across matched groups for different levels of earthquake exposure.
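One common way to examine covariate balance across exposure levels is the standardized mean difference (SMD), the difference in group means divided by a pooled standard deviation. The sketch below computes it for every pair of levels. The source does not state which balance measure the team used, so the choice of SMD is an assumption, and the level names and covariate values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(x, y):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x) + variance(y)) / 2)
    return (mean(x) - mean(y)) / pooled_sd


# Hypothetical values of one covariate in three matched exposure groups.
levels = {
    "low": [50, 52, 54, 56],
    "medium": [51, 53, 55, 57],
    "high": [50, 53, 54, 55],
}

# SMD for every pair of exposure levels.
smds = {
    (a, b): standardized_mean_difference(levels[a], levels[b])
    for a, b in combinations(levels, 2)
}

# The pair with the largest absolute imbalance.
worst = max(smds, key=lambda pair: abs(smds[pair]))

for pair, value in smds.items():
    print(pair, round(value, 3))
print("largest imbalance:", worst)
```

With 3, 5, or 10 exposure levels the number of pairs grows quickly (3, 10, and 45 respectively), so summarizing balance by the worst pair, as done here, is one simple option.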
Patients, clinicians, and researchers provided input during the study.
Data Source
Developing statistical method: linear-sized matching formulation
Testing new method: simulation studies, empirical analysis
Notes
The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.
ICPSR usually offers files in multiple formats so that researchers can access data and documentation in the formats that best suit their needs. If you have questions about the accessibility of materials distributed by ICPSR or require further assistance, please visit ICPSR’s Accessibility Center.
