Leveraging Large Language Models for Decision Support in Personalized Oncology

Cited by: 63
Authors
Benary, Manuela [1, 2, 3, 4]
Wang, Xing David [5]
Schmidt, Max [1, 2, 3, 6]
Soll, Dominik [1, 2, 3, 7]
Hilfenhaus, Georg [1, 2, 3, 8]
Nassir, Mani [1, 2, 3, 8]
Sigler, Christian [1, 2, 3]
Knoedler, Maren [1, 2, 3]
Keller, Ulrich [2, 3, 6, 9, 10]
Beule, Dieter [4]
Keilholz, Ulrich [1, 2, 3, 9, 10]
Leser, Ulf [5]
Rieke, Damian T. [1, 2, 3, 6, 9, 10]
Affiliations
[1] Charite Univ Med Berlin, Comprehens Canc Ctr, Charitepl 1, D-10117 Berlin, Germany
[2] Free Univ Berlin, Berlin, Germany
[3] Humboldt Univ, Berlin, Germany
[4] Charite Univ Med Berlin, Berlin Inst Hlth, Core Unit Bioinformat, Charitepl 1, Berlin, Germany
[5] Humboldt Univ, Knowledge Management Bioinformat, Berlin, Germany
[6] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Benjamin Franklin, Berlin, Germany
[7] Charite Univ Med Berlin, Dept Endocrinol & Metab Dis, Berlin, Germany
[8] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Charite Mitte, Berlin, Germany
[9] German Canc Consortium, Berlin, Germany
[10] German Canc Res Ctr, Partner Site Berlin, Berlin, Germany
Keywords
EFFICACY
DOI
10.1001/jamanetworkopen.2023.43689
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Importance: Clinical interpretation of complex biomarkers for precision oncology currently requires manual investigation of previous studies and databases. Conversational large language models (LLMs) might be beneficial as automated tools for assisting clinical decision-making.

Objective: To assess the performance of 4 recent LLMs as support tools for precision oncology and to define their role.

Design, Setting, and Participants: This diagnostic study examined 10 fictional cases of patients with advanced cancer with genetic alterations. In 2023, each case was submitted to 4 different LLMs (ChatGPT, Galactica, Perplexity, and BioMedLM) and 1 expert physician to identify personalized treatment options. Treatment options were masked and presented to a molecular tumor board (MTB), whose members rated the likelihood of a treatment option coming from an LLM on a scale from 0 to 10 (0, extremely unlikely; 10, extremely likely) and decided whether the treatment option was clinically useful.

Main Outcomes and Measures: Number of treatment options; precision, recall, and F1 score of LLMs compared with human experts; and recognizability and usefulness of recommendations.

Results: For 10 fictional cancer patients (4 with lung cancer, 6 with other cancers; median [IQR] 3.5 [3.0-4.8] molecular alterations per patient), the human expert identified a median (IQR) of 4.0 (4.0-4.0) treatment options per patient, compared with 3.0 (3.0-5.0), 7.5 (4.3-9.8), 11.5 (7.8-13.0), and 13.0 (11.3-21.5) for the 4 LLMs, respectively. With the expert as the criterion standard, LLM-proposed treatment options reached F1 scores of 0.04, 0.17, 0.14, and 0.19 across all patients combined. Combining treatment options from different LLMs yielded a precision of 0.29 and a recall of 0.29, for an F1 score of 0.29. LLM-generated treatment options were recognized as AI-generated with a median (IQR) rating of 7.5 (5.3-9.0) points, in contrast to 2.0 (1.0-3.0) points for manually annotated cases. A crucial reason for identifying treatment options as AI-generated was insufficient accompanying evidence. For each patient, at least 1 LLM generated a treatment option that MTB members considered helpful. Two unique useful treatment options (including 1 unique treatment strategy) were identified only by LLMs.

Conclusions and Relevance: In this diagnostic study, treatment options proposed by LLMs in precision oncology did not reach the quality and credibility of those from human experts; however, the LLMs generated helpful ideas that might have complemented established procedures. Considering technological progress, LLMs could play an increasingly important role in assisting with screening and selecting relevant biomedical literature to support evidence-based, personalized treatment decisions.
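
The precision, recall, and F1 figures in the abstract can be illustrated with a small Python sketch. This is only an illustration: the record does not describe how treatment options were matched, so the function below assumes exact matching of option labels as sets, and the function name and the drug labels are hypothetical placeholders rather than data from the study.

```python
def precision_recall_f1(proposed, reference):
    """Score proposed treatment options against a reference (criterion standard) set.

    Assumption: two options match only if their labels are identical.
    """
    proposed, reference = set(proposed), set(reference)
    true_positives = len(proposed & reference)
    precision = true_positives / len(proposed) if proposed else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical option labels, for illustration only (not taken from the study).
expert_options = {"drug_a", "drug_b", "drug_c", "drug_d"}
llm_options = {"drug_a", "drug_e", "drug_f", "drug_g", "drug_h"}

precision, recall, f1 = precision_recall_f1(llm_options, expert_options)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, equal values of precision and recall give the same F1, which is consistent with the combined result reported above (0.29, 0.29, and 0.29).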
Pages: 11