Leveraging large language models for generating responses to patient messages-a subjective analysis

Cited by: 10
Authors
Liu, Siru [1, 5]
McCoy, Allison B. [1]
Wright, Aileen P. [1, 2]
Carew, Babatunde [3]
Genkins, Julian Z. [4]
Huang, Sean S. [1, 2]
Peterson, Josh F. [1, 2]
Steitz, Bryan [1]
Wright, Adam [1]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, Nashville, TN 37212 USA
[2] Vanderbilt Univ, Med Ctr, Dept Med, Nashville, TN 37212 USA
[3] Vanderbilt Univ, Med Ctr, Dept Gen Internal Med & Publ Hlth, Nashville, TN 37212 USA
[4] Stanford Univ, Dept Med, Stanford, CA 94304 USA
[5] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, 2525 West End Ave 1475, Nashville, TN 37212 USA
Keywords
artificial intelligence; clinical decision support; large language model; patient portal; primary care
DOI
10.1093/jamia/ocae052
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal.

Materials and Methods: Using a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to rewrite physician responses from an open-source dataset into informative paragraphs that offered patient education while emphasizing empathy and professionalism. Combining this dataset with the patient portal data, we further fine-tuned our model (CLAIR-Long). To evaluate the fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review the responses generated by our models and by ChatGPT and to rate them for empathy, responsiveness, accuracy, and usefulness.

Results: The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, along with 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short generated concise responses similar to providers' responses. CLAIR-Long responses provided more patient educational content than CLAIR-Short responses and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy and a neutral rating for usefulness.

Conclusion: This subjective analysis suggests that leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and healthcare providers.
Pages: 1367 - 1379
Page count: 13
Related papers
50 in total
  • [31] LEVERAGING LARGE LANGUAGE MODELS WITH VOCABULARY SHARING FOR SIGN LANGUAGE TRANSLATION
    Lee, Huije
    Kim, Jung-Ho
    Hwang, Eui Jun
    Kim, Jaewoo
    Park, Jong C.
    2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023,
  • [32] The effect of using a large language model to respond to patient messages
    Chen, Shan
    Guevara, Marco
    Moningi, Shalini
    Hoebers, Frank
    Elhalawani, Hesham
    Kann, Benjamin H.
    Chipidza, Fallon E.
    Leeman, Jonathan
    Aerts, Hugo J. W. L.
    Miller, Timothy
    Savova, Guergana K.
    Gallifant, Jack
    Celi, Leo A.
    Mak, Raymond H.
    Lustberg, Maryam
    Afshar, Majid
    Bitterman, Danielle S.
    LANCET DIGITAL HEALTH, 2024, 6 (06): : e379 - e381
  • [33] Using Large Language Models to Enhance Programming Error Messages
    Leinonen, Juho
    Hellas, Arto
    Sarsa, Sami
    Reeves, Brent
    Denny, Paul
    Prather, James
    Becker, Brett A.
    PROCEEDINGS OF THE 54TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, VOL 1, SIGCSE 2023, 2023, : 563 - 569
  • [34] Leveraging Large Language Models for Enhanced Classification and Analysis: Fire Incidents Case Study
    Alkhammash, Eman H.
    FIRE-SWITZERLAND, 2025, 8 (01):
  • [35] Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services
    Wu, Chaomin
    Wu, Di
    Pan, Yushan
    Wang, Hao
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT I, NLPCC 2024, 2025, 15359 : 56 - 68
  • [36] Generating colloquial radiology reports with large language models
    Tang, Cynthia Crystal
    Nagesh, Supriya
    Fussell, David A.
    Glavis-Bloom, Justin
    Mishra, Nina
    Li, Charles
    Cortes, Gillean
    Hill, Robert
    Zhao, Jasmine
    Gordon, Angellica
    Wright, Joshua
    Troutt, Hayden
    Tarrago, Rod
    Chow, Daniel S.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (11) : 2660 - 2667
  • [37] Extraction of Subjective Information from Large Language Models
    Kobayashi, Atsuya
    Yamaguchi, Saneyasu
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024, : 1612 - 1617
  • [38] Leveraging large language models: transforming scholarly publishing for the better
    Fortier, Lisa A.
    AMERICAN JOURNAL OF VETERINARY RESEARCH, 2023, 84 (08) : 1 - 2
  • [39] Leveraging foundation and large language models in medical artificial intelligence
    Wong, Io Nam
    Monteiro, Olivia
    Baptista-Hon, Daniel T.
    Wang, Kai
    Lu, Wenyang
    Sun, Zhuo
    Nie, Sheng
    Yin, Yun
    CHINESE MEDICAL JOURNAL, 2024, 137 (21)
  • [40] Leveraging large language models to monitor climate technology innovation
    Toetzke, Malte
    Probst, Benedict
    Feuerriegel, Stefan
    ENVIRONMENTAL RESEARCH LETTERS, 2023, 18 (09)