PMC-LLaMA: toward building open-source language models for medicine

Cited by: 13
Authors
Wu, Chaoyi [1 ,2 ]
Lin, Weixiong [1 ,2 ]
Zhang, Xiaoman [1 ,2 ]
Zhang, Ya [1 ,2 ]
Xie, Weidi [1 ,2 ]
Wang, Yanfeng [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr CM, Shanghai 200240, Peoples R China
[2] Shanghai AI Lab, Shanghai 200232, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
large language models; biomedical NLP; generative language models; ChatGPT;
DOI
10.1093/jamia/ocae045
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Objective: Recently, large language models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering (QA) situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this article, we describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA.
Materials and Methods: We adapt a general-purpose LLM to the medical domain through data-centric knowledge injection, integrating 4.8M biomedical academic papers and 30K medical textbooks, followed by comprehensive domain-specific instruction fine-tuning covering medical QA, rationales for reasoning, and conversational dialogues, totaling 202M tokens.
Results: In evaluations on various public medical QA benchmarks and in manual rating, our lightweight PMC-LLaMA, with only 13B parameters, exhibits superior performance, even surpassing ChatGPT. All models, codes, and datasets for instruction tuning will be released to the research community.
Discussion: Our contributions are 3-fold: (1) we build an open-source LLM for the medical domain; we believe the proposed PMC-LLaMA model can promote further development of foundation models in medicine, serving as a trainable medical generative language backbone; (2) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component and how different training data and model scales affect medical LLMs; (3) we contribute a large-scale, comprehensive dataset for instruction tuning.
Conclusion: In this article, we systematically investigate the process of building an open-source medical-specific LLM, PMC-LLaMA.
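The two-stage recipe summarized in the abstract (continued causal-LM pretraining on biomedical text for knowledge injection, then medical instruction fine-tuning) can be pictured with a minimal training sketch. The sketch below uses Hugging Face Transformers and Datasets; the base checkpoint name, JSONL file names, the "text" field, and all hyperparameters are illustrative assumptions, not the authors' released PMC-LLaMA configuration.

# Minimal sketch of the two adaptation stages described in the abstract:
# (1) data-centric knowledge injection via continued causal-LM pretraining on
# biomedical papers and textbooks, (2) medical instruction fine-tuning.
# Paths, field names, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "huggyllama/llama-13b"  # assumed general-purpose backbone

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)


def tokenize(batch):
    # Plain causal-LM tokenization; labels are produced by the collator.
    return tokenizer(batch["text"], truncation=True, max_length=2048)


def run_stage(data_files, output_dir):
    """One training stage over a JSONL corpus with a 'text' field (assumed schema)."""
    dataset = load_dataset("json", data_files=data_files, split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=output_dir,
            per_device_train_batch_size=1,
            gradient_accumulation_steps=16,
            num_train_epochs=1,
            learning_rate=2e-5,
            bf16=True,
            logging_steps=50,
        ),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()


# Stage 1: knowledge injection on raw paper/textbook passages (hypothetical file).
run_stage("pmc_papers_and_textbooks.jsonl", "ckpt/knowledge_injection")
# Stage 2: instruction tuning on QA, rationales, and dialogues rendered as prompts.
run_stage("medical_instructions.jsonl", "ckpt/instruction_tuned")

In practice the two stages differ mainly in their corpora: raw paper and textbook passages for knowledge injection versus QA pairs, reasoning rationales, and dialogues rendered into prompt-response text for instruction tuning.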
Pages: 1833-1843
Page count: 11