Leveraging Large Language Models for Decision Support in Personalized Oncology

Cited by: 63
Authors
Benary, Manuela [1, 2, 3, 4]
Wang, Xing David [5]
Schmidt, Max [1, 2, 3, 6]
Soll, Dominik [1, 2, 3, 7]
Hilfenhaus, Georg [1, 2, 3, 8]
Nassir, Mani [1, 2, 3, 8]
Sigler, Christian [1, 2, 3]
Knoedler, Maren [1, 2, 3]
Keller, Ulrich [2, 3, 6, 9, 10]
Beule, Dieter [4]
Keilholz, Ulrich [1, 2, 3, 9, 10]
Leser, Ulf [5]
Rieke, Damian T. [1, 2, 3, 6, 9, 10]
Affiliations
[1] Charite Univ Med Berlin, Comprehens Canc Ctr, Charitepl 1, D-10117 Berlin, Germany
[2] Free Univ Berlin, Berlin, Germany
[3] Humboldt Univ, Berlin, Germany
[4] Charite Univ Med Berlin, Berlin Inst Hlth, Core Unit Bioinformat, Charitepl 1, Berlin, Germany
[5] Humboldt Univ, Knowledge Management Bioinformat, Berlin, Germany
[6] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Benjamin Franklin, Berlin, Germany
[7] Charite Univ Med Berlin, Dept Endocrinol & Metab Dis, Berlin, Germany
[8] Charite Univ Med Berlin, Dept Hematol Oncol & Canc Immunol, Campus Charite Mitte, Berlin, Germany
[9] German Canc Consortium, Berlin, Germany
[10] German Canc Res Ctr, Partner Site Berlin, Berlin, Germany
Keywords
EFFICACY;
DOI
10.1001/jamanetworkopen.2023.43689
Chinese Library Classification
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Importance: Clinical interpretation of complex biomarkers for precision oncology currently requires manual investigations of previous studies and databases. Conversational large language models (LLMs) might be beneficial as automated tools for assisting clinical decision-making.

Objective: To assess performance and define their role using 4 recent LLMs as support tools for precision oncology.

Design, Setting, and Participants: This diagnostic study examined 10 fictional cases of patients with advanced cancer with genetic alterations. Each case was submitted to 4 different LLMs (ChatGPT, Galactica, Perplexity, and BioMedLM) and 1 expert physician to identify personalized treatment options in 2023. Treatment options were masked and presented to a molecular tumor board (MTB), whose members rated the likelihood of a treatment option coming from an LLM on a scale from 0 to 10 (0, extremely unlikely; 10, extremely likely) and decided whether the treatment option was clinically useful.

Main Outcomes and Measures: Number of treatment options, precision, recall, F1 score of LLMs compared with human experts, recognizability, and usefulness of recommendations.

Results: For 10 fictional cancer patients (4 with lung cancer, 6 with other; median [IQR] 3.5 [3.0-4.8] molecular alterations per patient), a median (IQR) number of 4.0 (4.0-4.0) compared with 3.0 (3.0-5.0), 7.5 (4.3-9.8), 11.5 (7.8-13.0), and 13.0 (11.3-21.5) treatment options each was identified by the human expert and 4 LLMs, respectively. When considering the expert as a criterion standard, LLM-proposed treatment options reached F1 scores of 0.04, 0.17, 0.14, and 0.19 across all patients combined. Combining treatment options from different LLMs allowed a precision of 0.29 and a recall of 0.29 for an F1 score of 0.29. LLM-generated treatment options were recognized as AI-generated with a median (IQR) 7.5 (5.3-9.0) points in contrast to 2.0 (1.0-3.0) points for manually annotated cases. A crucial reason for identifying AI-generated treatment options was insufficient accompanying evidence. For each patient, at least 1 LLM generated a treatment option that was considered helpful by MTB members. Two unique useful treatment options (including 1 unique treatment strategy) were identified only by LLM.

Conclusions and Relevance: In this diagnostic study, treatment options of LLMs in precision oncology did not reach the quality and credibility of human experts; however, they generated helpful ideas that might have complemented established procedures. Considering technological progress, LLMs could play an increasingly important role in assisting with screening and selecting relevant biomedical literature to support evidence-based, personalized treatment decisions.
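The precision, recall, and F1 figures in the abstract compare LLM-proposed treatment options against the expert's options as the criterion standard. The sketch below shows one way such scores could be computed. It assumes options are matched by exact label, which the abstract does not specify; the function names and the treatment labels (`drug_a` and so on) are hypothetical placeholders, not data from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(expert_options, llm_options):
    """Score LLM options against expert options treated as the reference set.

    Assumption: two options match only if their labels are identical.
    """
    expert = set(expert_options)
    llm = set(llm_options)
    matched = len(expert & llm)
    # Precision: share of LLM options that the expert also proposed.
    precision = matched / len(llm) if llm else 0.0
    # Recall: share of expert options that the LLM recovered.
    recall = matched / len(expert) if expert else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical example: 4 expert options, 5 LLM options, 1 in common.
expert = {"drug_a", "drug_b", "drug_c", "drug_d"}
llm = {"drug_a", "drug_e", "drug_f", "drug_g", "drug_h"}
p, r, f1 = precision_recall_f1(expert, llm)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")

# When precision equals recall, F1 equals that same value, which is
# consistent with the reported 0.29 / 0.29 / 0.29 for combined LLMs.
combined_f1 = f1_score(0.29, 0.29)
```

Because F1 is a harmonic mean, an LLM that proposes many options can raise recall only at the cost of precision, which is one reading of why the reported F1 scores remain low despite the larger number of LLM-generated options.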
Pages: 11