Measuring and Talking to Patients About the Accuracy of Data Used in Patient-Centered Outcomes Research [Methods Study], North Carolina and Arkansas, 2013-2018 (ICPSR 39515)
Version Date: Oct 14, 2025
Principal Investigator(s):
Anita Walden, Oregon Health and Science University
https://doi.org/10.3886/ICPSR39515.v1
Version V1
Summary
For research studies, researchers can use data about patients' health and treatments from electronic health records, or EHRs. They may also collect self-reported data directly from patients. But a patient's EHR and self-reported data may not always agree. For example, differences may exist between the medicines that patients report taking and the medicines listed in their EHRs. Researchers don't know which of these two data sources is more accurate.
In this project, the research team looked at EHR and self-reported data to learn which data source was more accurate.
Study Purpose
This study had four aims: (1) to describe the level of agreement between self-reported (SR) medical conditions, procedures, hospitalizations, and smoking status and those documented in EHRs; (2) to describe the areas of disagreement; (3) to determine which data source is more accurate and under what conditions; and (4) to describe the errors found in each data source and their root causes.
Study Design
The research team enrolled 5,900 adult patients receiving care at primary care and specialty clinics and hospitals in North Carolina and Arkansas between December 2018 and December 2019, and obtained EHR and self-reported data for each of them. In North Carolina, self-reported data came from the MURDOCK Community registry. In Arkansas, the team collected self-reported data directly from patients.
To measure agreement between the EHR data and self-reported data, the research team linked the data sources and conducted statistical analysis to compare data on 45 items, including 34 medical conditions, 8 procedures, hospitalizations, and smoking status.
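The linking and comparison step can be illustrated with a minimal sketch. It is not the study's actual analysis, which this record does not describe in detail. The patient IDs, item names, and the function `link_and_compare` are hypothetical, and the sketch assumes each source is a mapping from patient ID to yes/no values per item.

```python
def link_and_compare(ehr, self_report, items):
    """Link two sources on a shared patient ID and return percent agreement per item.

    ehr and self_report are hypothetical dicts of the form
    {patient_id: {item_name: bool}}. Patients missing from either
    source, or missing a value for an item, are skipped for that item.
    """
    shared_ids = ehr.keys() & self_report.keys()
    agreement = {}
    for item in items:
        pairs = [
            (ehr[pid].get(item), self_report[pid].get(item))
            for pid in shared_ids
        ]
        # Keep only patients with a recorded value in both sources.
        pairs = [(a, b) for a, b in pairs if a is not None and b is not None]
        if not pairs:
            agreement[item] = None
            continue
        matches = sum(1 for a, b in pairs if a == b)
        agreement[item] = 100.0 * matches / len(pairs)
    return agreement


# Illustrative, made-up records; not data from the study.
ehr_data = {
    "p1": {"diabetes": True, "smoker": False},
    "p2": {"diabetes": False, "smoker": True},
    "p3": {"diabetes": True},
}
sr_data = {
    "p1": {"diabetes": True, "smoker": True},
    "p2": {"diabetes": False, "smoker": True},
    "p4": {"diabetes": False},
}
result = link_and_compare(ehr_data, sr_data, ["diabetes", "smoker"])
```

Percent agreement is only one possible summary; the study may have used other statistics.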
To assess the accuracy of each data source, the research team selected and interviewed 610 patients who had discrepancies between their EHR and self-reported data to identify the cause of each discrepancy. Except for age and ethnicity, interviewed patients were representative of the larger sample of study participants. The team created a reference data set for these patients by replacing data with confirmed data from the interviews. The team then compared each item in the EHR and self-reported data with the reference data set to calculate percent agreement, sensitivity, and specificity.
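As a rough illustration of the three measures named above, the sketch below compares one data source with a reference data set for a single yes/no item. The function name `accuracy_metrics` and the example values are hypothetical; the standard definitions are assumed (sensitivity = true positives / all reference positives, specificity = true negatives / all reference negatives), and the study's own computation may differ in detail.

```python
def accuracy_metrics(source, reference):
    """Compare one source against reference values for a single yes/no item.

    source and reference are equal-length, non-empty sequences of booleans,
    one entry per patient, in the same patient order.
    """
    if len(source) != len(reference):
        raise ValueError("source and reference must have the same length")
    if not source:
        raise ValueError("at least one patient is required")

    tp = fp = tn = fn = 0
    for s, r in zip(source, reference):
        if s and r:
            tp += 1      # source says yes, reference confirms yes
        elif s and not r:
            fp += 1      # source says yes, reference says no
        elif not s and not r:
            tn += 1      # both say no
        else:
            fn += 1      # source says no, reference says yes

    total = tp + fp + tn + fn
    return {
        "percent_agreement": 100.0 * (tp + tn) / total,
        # Undefined (None) when the reference has no positives or no negatives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Made-up values for five patients; not data from the study.
ehr_values = [True, True, False, False, True]
reference_values = [True, False, False, False, True]
metrics = accuracy_metrics(ehr_values, reference_values)
```

Running the same function once with the EHR values and once with the self-reported values would allow the two sources to be compared item by item.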
Patients and doctors gave input on the design and conduct of the study.
Universe
Adult patients who received treatment between December 2018 and December 2019 from clinics and hospitals in North Carolina and Arkansas
Data Source
EHR and self-reported data from 5,900 adult patients, ages 18 and older, who received treatment between December 2018 and December 2019 from clinics and hospitals in North Carolina and Arkansas
Notes
The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.
ICPSR usually offers files in multiple formats so that researchers can access data and documentation in the formats that best suit their needs. If you have questions about the accessibility of materials distributed by ICPSR or require further assistance, please visit ICPSR’s Accessibility Center.
