Development of Computational Methods for Evaluating Doctor-Patient Communication [Methods Study], United States, 2016-2021 (ICPSR 39720)

Name: Development of Computational Methods for Evaluating Doctor-Patient Communication [Methods Study], United States, 2016-2021
Published: 2026-03-18
License: https://www.icpsr.umich.edu/web/ICPSR/studies/39720/terms

Version Date: Mar 18, 2026

Principal Investigator(s):
Zac E. Imel, University of Utah

https://doi.org/10.3886/ICPSR39720.v1

Version V1

Summary

The way doctors communicate with patients during office visits can affect the quality of care. Studying conversations between doctors and patients can help doctors improve their communication skills.

To study conversations, researchers rely on written records, or transcripts, of office visits. They read the transcripts and give each conversation topic a label. For example, topics may include smoking or pain. But labeling topics in this way may take a lot of time.

In this project, the research team created and tested a new method to make this work easier using natural language processing, or NLP. With NLP, computer programs interpret written language. NLP methods use a process called machine learning, where computer programs use data to learn how to perform different tasks with little or no human input.

Citation

Imel, Zac E. Development of Computational Methods for Evaluating Doctor-Patient Communication [Methods Study], United States, 2016-2021. Inter-university Consortium for Political and Social Research [distributor], 2026-03-18. https://doi.org/10.3886/ICPSR39720.v1

Funding

Patient-Centered Outcomes Research Institute (PCORI) (ME-1602-34167)

Subject Terms

artificial intelligence; communications systems; learning; patient care

Geographic Coverage

United States

Distributor(s)

Inter-university Consortium for Political and Social Research

Time Period(s)

2016 -- 2021

Study Purpose

The specific aims were to develop and evaluate natural language processing (NLP) models that predict (1) topics of conversations and (2) emotional valence of patient-provider interactions.

Study Design

To develop the NLP models, researchers first trained machine learning algorithms to label topics in patient-clinician conversations. They used 279 transcripts of patient-clinician conversations from two studies; the transcripts had already been manually annotated with 36 topic labels, such as physical examination, cigarette use, or pain. The algorithms learned to associate specific words in a conversation with the manually assigned topic labels and then used those associations to predict labels for other transcripts.
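The idea of learning word-to-label associations can be illustrated with a minimal sketch. This record does not say which algorithm the researchers used, so the example below swaps in a simple bag-of-words Naive Bayes classifier as an assumption. The function names (`tokenize`, `train`, `predict`) and the toy sentences are hypothetical and are not drawn from the study data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do more."""
    return text.lower().split()


def train(examples):
    """Count label frequencies and per-label word frequencies.

    `examples` is a list of (text, topic_label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest add-one-smoothed log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data (assumption); not from the MHD or ADEPT transcripts.
TRAIN = [
    ("how many cigarettes do you smoke a day", "cigarette use"),
    ("have you tried to quit smoking", "cigarette use"),
    ("where does it hurt the most", "pain"),
    ("is the pain sharp or dull", "pain"),
]

model = train(TRAIN)
print(predict(model, "do you still smoke"))
```

In this sketch, words such as "smoke" appear only under one label in the training pairs, so unseen text containing them is pulled toward that label.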

Next, researchers developed three types of NLP classification models called non-sequential, window-based, and sequential. Each type of model used a different statistical method to label topics in the transcripts.
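This record does not define the three model types. One plausible reading, offered here only as an assumption, is that a non-sequential model labels each piece of the conversation on its own, a window-based model also sees a few neighboring pieces, and a sequential model considers the whole ordered conversation. The sketch below shows only how the input text might differ between the first two; the helper `window_text` and the sample turns are hypothetical.

```python
def window_text(turns, index, size=1):
    """Join the turn at `index` with up to `size` neighbors on each side."""
    start = max(0, index - size)
    end = min(len(turns), index + size + 1)
    return " ".join(turns[start:end])


# Invented conversation turns (assumption); not from the study transcripts.
turns = [
    "any trouble sleeping lately",
    "not really",
    "and how about the pain in your knee",
    "it is worse in the morning",
]

# Non-sequential input: each turn is classified by itself.
non_sequential_inputs = list(turns)

# Window-based input: each turn is classified together with its neighbors.
window_inputs = [window_text(turns, i, size=1) for i in range(len(turns))]

for text in window_inputs:
    print(text)
```

Under this reading, a short turn like "not really" gains context from the surrounding turns, which a classifier that sees it alone would lack.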

Researchers then evaluated how accurately each model labeled topics by comparing its labels with the manually assigned labels in the same transcripts. They also compared each model's topic labels with those of a baseline NLP model that labeled the most common topics.
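The comparison described above can be sketched as follows. The sketch assumes that "accuracy" means the share of labels matching the manual ones and that the baseline always predicts the single most frequent label; the record does not confirm either detail, and the study may have used other metrics. The function names and the label lists are hypothetical.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of predicted labels that match the manual reference labels."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        return 0.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def majority_baseline(known_labels, n):
    """Predict the most frequent known label for all n items (assumed baseline)."""
    most_common_label = Counter(known_labels).most_common(1)[0][0]
    return [most_common_label] * n


# Invented labels (assumption); not results from the study.
manual = ["pain", "pain", "cigarette use", "physical examination", "pain"]
model_pred = ["pain", "cigarette use", "cigarette use", "physical examination", "pain"]
baseline_pred = majority_baseline(manual, len(manual))

print("model accuracy:", accuracy(model_pred, manual))
print("baseline accuracy:", accuracy(baseline_pred, manual))
```

A model is only informative if it beats this kind of baseline, since a baseline that always guesses the most common topic can score well when a few topics dominate.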

A patient advisory board provided input throughout the study, including how to explain complex research methods in a clear way.

Data Source

Transcripts of audio recordings of patient-provider interactions from the Mental Health Discussion (MHD) study and transcripts of video recordings from the Assessment of Doctor-Elderly Patient Encounters (ADEPT) study

Original Release Date

2026-03-18

Notes

The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.
ICPSR usually offers files in multiple formats for researchers to be able to access data and documentation in formats that work well within their needs. If you have questions about the accessibility of materials distributed by ICPSR or require further assistance, please visit ICPSR’s Accessibility Center.
