Large language models leverage external knowledge to extend clinical insight beyond language boundaries

Cited by: 1
Authors
Wu, Jiageng [1 ]
Wu, Xian [2 ]
Qiu, Zhaopeng [2 ]
Li, Minghui
Lin, Shixu [1 ]
Zhang, Yingying [2 ]
Zheng, Yefeng [2 ]
Yuan, Changzheng [1 ,3 ,5 ]
Yang, Jie [1 ,4 ,6 ,7 ]
Affiliations
[1] Zhejiang Univ, Sch Med, Sch Publ Hlth, Hangzhou 310058, Peoples R China
[2] Tencent YouTu Lab, Jarvis Res Ctr, 1 Tianchen East Rd, Beijing 100101, Peoples R China
[3] Harvard TH Chan Sch Publ Hlth, Dept Nutr, Boston, MA 02115 USA
[4] Harvard Med Sch, Brigham & Womens Hosp, Dept Med, Div Pharmacoepidemiol & Pharmacoecon, Boston, MA 02115 USA
[5] Zhejiang Univ, Sch Publ Hlth, 866 Yuhangtang Rd, Hangzhou, Zhejiang, Peoples R China
[6] Brigham & Womens Hosp, Dept Med, 75 Francis St, Boston, MA 02115 USA
[7] Harvard Med Sch, 75 Francis St, Boston, MA 02115 USA
Keywords
large language models; clinical knowledge; natural language processing; medical examination;
DOI
10.1093/jamia/ocae079
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Objectives: Large Language Models (LLMs) such as ChatGPT and Med-PaLM have excelled in various medical question-answering tasks. However, these English-centric models encounter challenges in non-English clinical settings, primarily due to limited clinical knowledge in the respective languages, a consequence of imbalanced training corpora. We systematically evaluate LLMs in the Chinese medical context and develop a novel in-context learning framework to enhance their performance.

Materials and Methods: The latest China National Medical Licensing Examination (CNMLE-2022) served as the benchmark. We collected 53 medical books and 381,149 medical questions to construct the medical knowledge base and question bank. The proposed Knowledge and Few-shot Enhancement In-context Learning (KFE) framework leverages the in-context learning ability of LLMs to integrate diverse external clinical knowledge sources. We evaluated KFE with ChatGPT (GPT-3.5), GPT-4, Baichuan2-7B, Baichuan2-13B, and QWEN-72B on CNMLE-2022 and further investigated the effectiveness of different pathways for incorporating medical knowledge into LLMs from 7 distinct perspectives.

Results: Applied directly, ChatGPT failed to qualify for the CNMLE-2022, scoring 51. When combined with the KFE framework, LLMs of varying sizes yielded consistent and significant improvements: ChatGPT's score rose to 70.04, and GPT-4 achieved the highest score of 82.59. These results surpass the qualification threshold (60) and exceed the average human score of 68.70, affirming the effectiveness and robustness of the framework. KFE also enabled the smaller Baichuan2-13B to pass the examination, showcasing its potential in low-resource settings.

Discussion and Conclusion: This study sheds light on optimal practices for enhancing the capabilities of LLMs in non-English medical scenarios. By synergizing medical knowledge through in-context learning, LLMs can extend clinical insight beyond language barriers in healthcare, significantly reducing language-related disparities in LLM applications and ensuring global benefit in this field.
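The abstract describes the KFE pipeline only at a high level: retrieve relevant passages from a medical knowledge base and similar solved questions from a question bank, then place both in the model's context ahead of the target question. The Python below is a minimal, hypothetical sketch of that general retrieval-augmented in-context learning pattern, not the authors' implementation; the function names, the token-overlap stand-in retriever, and the prompt layout are all illustrative assumptions.

```python
# Minimal, hypothetical sketch of a KFE-style prompt builder.
# Everything here (names, token-overlap retrieval, prompt layout) is an
# illustrative assumption, not the paper's actual implementation.

def overlap_score(query: str, text: str) -> float:
    """Crude lexical relevance: fraction of query tokens present in text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / max(len(q), 1)

def top_k(query: str, corpus: list[str], k: int) -> list[str]:
    """Return the k corpus entries most relevant to the query."""
    return sorted(corpus, key=lambda t: overlap_score(query, t), reverse=True)[:k]

def build_kfe_prompt(question: str,
                     knowledge_base: list[str],  # passages from medical books
                     question_bank: list[str],   # solved exam questions
                     k_knowledge: int = 3,
                     k_shots: int = 2) -> str:
    """Compose retrieved knowledge + few-shot examples + the target question."""
    knowledge = top_k(question, knowledge_base, k_knowledge)
    shots = top_k(question, question_bank, k_shots)
    parts = ["Reference knowledge:"]
    parts += [f"- {p}" for p in knowledge]
    parts.append("Worked examples:")
    parts += shots
    parts.append("Question: " + question)
    parts.append("Answer:")
    return "\n".join(parts)

if __name__ == "__main__":
    kb = ["Aspirin irreversibly inhibits cyclooxygenase.",
          "Beta-blockers reduce myocardial oxygen demand."]
    bank = ["Q: Which drug irreversibly inhibits cyclooxygenase? A: Aspirin."]
    print(build_kfe_prompt("Which drug irreversibly inhibits cyclooxygenase?",
                           kb, bank))
```

In the study, the assembled prompt would then be sent to the evaluated model (e.g., GPT-4); the retrieval here is deliberately trivial so the composition pattern, knowledge passages plus few-shot examples plus the target question, stays visible.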
Pages: 2054-2064
Page count: 11