Methods for Analysis and Interpretation of Data Subject to Informative Visit Times [Methods Study], 2013-2018 (ICPSR 39474)

Version Date: Aug 27, 2025

Principal Investigator(s):
Charles E. McCulloch, University of California-San Francisco

https://doi.org/10.3886/ICPSR39474.v1

Version V1

Comparative effectiveness research compares two or more treatments to see which one works better for certain patients. Researchers often use data from patients' electronic health records to compare different treatments. This study addresses some problems that can arise from this practice. In some long-term research studies, researchers use data collected when patients in the studies see their doctors. Regularly scheduled doctor visits, called well visits, include yearly checkups or periodic blood pressure checks. Other doctor visits, called sick visits, occur when a patient feels sick or needs special care.

Well and sick visits can produce different types of health record data. In addition, test results at sick visits may be different from results at well visits. Using data from sick visits may inappropriately influence, or bias, a study's results. Also, patients may go to the doctor more often when they have symptoms or chronic health problems. Researchers may then collect more data from these patients than they collect from the healthier patients. Unequal amounts of data per patient make it harder to compare treatment results.

For this study, the research team created three tests to determine whether data from sick visits bias a study's findings. The team also compared standard and newer statistical methods for analyzing data that include sick visits. Researchers designed the newer methods to reduce bias from data obtained at sick visits. With less biased results, doctors can be more certain about which treatment worked better for certain patients.

McCulloch, Charles E. Methods for Analysis and Interpretation of Data Subject to Informative Visit Times [Methods Study], 2013-2018. Inter-university Consortium for Political and Social Research [distributor], 2025-08-27. https://doi.org/10.3886/ICPSR39474.v1

Funding: Patient-Centered Outcomes Research Institute (PCORI) (ME-1306-01466)
Distributor: Inter-university Consortium for Political and Social Research

2013 -- 2018

The purpose of the study was to advance statistical methodologies for analyzing data collected during regular health care. The study aims were the following:

  1. develop realistic models for outcome-dependent visit times;
  2. use theoretical calculations and simulation models to assess bias and efficiency in longitudinal statistical analyses applied to outcome-dependent visit databases;
  3. provide guidance as to which types of statistical inferences are accurate and exhibit little bias when using databases collected with outcome-dependent visit times versus those that are likely to be inaccurate;
  4. make recommendations on how to deal with outcome-dependent visit processes in clinical research.

Ordinary clinic data analyses are subject to systematic error because clinic visit frequency and timing may be related to the longitudinal outcome being studied. In this study, the research team created simulated data to assess bias introduced from outcome-dependent visit data in estimating treatment effects; the team used three standard statistical models and three newer models specifically designed to minimize this bias.
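The kind of bias described above can be illustrated with a small simulation. The sketch below is not the research team's simulation design; it is a minimal example that assumes a hypothetical random-intercept outcome and a logistic visit probability, with the function names and parameter values (`simulate_patient`, `naive_bias`, `dependence`) chosen only for illustration. It compares the mean outcome among recorded visits with the mean over all time points.

```python
import math
import random


def simulate_patient(rng, n_times=24, dependence=1.0):
    """Simulate one hypothetical patient.

    Returns (all_outcomes, observed_outcomes). An outcome is recorded
    only when a visit occurs, and the visit probability rises with the
    current outcome when dependence > 0 (an illustrative assumption).
    """
    intercept = rng.gauss(0.0, 1.0)  # patient-level random effect
    all_y, seen_y = [], []
    for _ in range(n_times):
        y = intercept + rng.gauss(0.0, 1.0)  # outcome at this time point
        all_y.append(y)
        # Logistic visit model: worse (higher) outcome -> more likely to visit.
        p_visit = 1.0 / (1.0 + math.exp(-(-1.0 + dependence * y)))
        if rng.random() < p_visit:
            seen_y.append(y)
    return all_y, seen_y


def naive_bias(dependence, n_patients=2000, seed=0):
    """Mean of observed outcomes minus mean of all outcomes."""
    rng = random.Random(seed)
    all_y, seen_y = [], []
    for _ in range(n_patients):
        a, s = simulate_patient(rng, dependence=dependence)
        all_y.extend(a)
        seen_y.extend(s)
    return sum(seen_y) / len(seen_y) - sum(all_y) / len(all_y)


if __name__ == "__main__":
    print("bias with outcome-independent visits:", naive_bias(0.0))
    print("bias with outcome-dependent visits:  ", naive_bias(1.0))
```

Under these assumptions, the naive average is close to unbiased when visits do not depend on the outcome and is shifted upward when they do, because patients are seen more often when their outcome is high.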

The research team proposed three tests for identifying data that may be affected by the number and timing of outcome-dependent visits. The test statistics incorporated the following patient-level data characteristics: inter-visit time, observed visit time, and total number of visits.
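The patient-level characteristics named above can be computed from a simple table of visit times. This description does not give the form of the three test statistics, so the sketch below only derives the inputs; the function name `visit_summaries` and the input layout (a dictionary from patient identifier to visit times) are assumptions made for illustration.

```python
from statistics import mean


def visit_summaries(visit_times):
    """Summarize visits per patient.

    visit_times: dict mapping a patient id to a list of visit times
    (any consistent numeric unit, e.g. days since enrollment).
    """
    summaries = {}
    for patient_id, times in visit_times.items():
        times = sorted(times)
        # Inter-visit times are gaps between consecutive visits.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        summaries[patient_id] = {
            "n_visits": len(times),       # total number of visits
            "visit_times": times,         # observed visit times
            "inter_visit_times": gaps,
            "mean_gap": mean(gaps) if gaps else None,
        }
    return summaries


if __name__ == "__main__":
    example = {"a": [0, 3, 9], "b": [5]}  # hypothetical records
    for patient_id, summary in visit_summaries(example).items():
        print(patient_id, summary)
```

A patient with a single visit has no inter-visit times, so the sketch reports `None` for the mean gap rather than guessing a value.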

To develop realistic simulations, the research team interviewed a stakeholder panel of four clinician-scientists who oversee clinical databases. The team wanted to determine how regular clinic visits and outcome-dependent clinic visits are defined and to document reasons for outcome-dependent visits. A wide range of simulated conditions included variations in outcome distributions, cluster and per-cluster sample sizes, degree of outcome dependence in visit data, and extent to which statistical assumptions were met or violated. Outcome-dependence simulation variations included visit probability based on an underlying health problem or outcome, subgroups with more frequent visits, and varied visit probabilities related to regularly or randomly scheduled visits.

Simulated data generated under outcome-dependent visit processes

2025-08-27

Notes

  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.