MediBioDeBERTa: Biomedical Language Model With Continuous Learning and Intermediate Fine-Tuning

Cited by: 0
Authors
Kim, Eunhui [1]
Jeong, Yuna [1]
Choi, Myung-Seok [1]
Affiliations
[1] Korea Inst Sci & Technol Informat, Daejeon 34131, South Korea
Keywords
Task analysis; Training; Biological system modeling; Transformers; Natural language processing; Correlation; Computational modeling; Language model; fine-tuning; domain-specific modeling
DOI
10.1109/ACCESS.2023.3341612
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
The emergence of large language models (LLMs) has marked a significant milestone in the evolution of natural language processing. With the expanded use of LLMs in multiple fields, the development of domain-specific pre-trained language models (PLMs) has become a natural progression and requirement. Developing domain-specific PLMs requires careful design, considering not only differences in training methods but also factors such as the type of training data and the choice of hyperparameters. This paper proposes MediBioDeBERTa, a specialized language model (LM) for biomedical applications. First, we present several practical analyses and methods for improving the performance of LMs in specialized domains. As the initial step, we developed SciDeBERTa v2, an LM specialized for the scientific domain. On the SciERC dataset, SciDeBERTa v2 achieves state-of-the-art performance on the named entity recognition (NER) task. We then provide an in-depth analysis of the datasets and training methods used in the biomedical field. Based on these analyses, MediBioDeBERTa was continually trained from SciDeBERTa v2 to specialize in the biomedical domain. Using the biomedical language understanding and reasoning benchmark (BLURB), we analyzed factors that degrade task performance and proposed additional improvement methods based on intermediate fine-tuning. The results demonstrate improved performance in three categories, NER, semantic similarity (SS), and question answering (QnA), as well as on the ChemProt relation extraction (RE) task in BLURB, compared with existing state-of-the-art LMs.
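The sketch below illustrates, in minimal form, the two-stage recipe summarised in the abstract: continual masked-language-model pre-training of a scientific-domain DeBERTa checkpoint on biomedical text, followed by intermediate fine-tuning on a related labelled task before fine-tuning on the final target task. It uses the Hugging Face transformers and datasets libraries; the checkpoint name (microsoft/deberta-v3-base standing in for SciDeBERTa v2), the toy sentences, and all hyperparameters are illustrative assumptions rather than the configuration used for MediBioDeBERTa.

```python
# Minimal sketch: (1) continual MLM pre-training on biomedical text,
# (2) intermediate fine-tuning, then target-task fine-tuning.
# Checkpoint names, toy data, and hyperparameters are placeholders,
# not the values used in the MediBioDeBERTa paper.
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_CKPT = "microsoft/deberta-v3-base"  # stand-in for a SciDeBERTa-v2-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(BASE_CKPT)

def tokenize(batch):
    # Fixed-length padding keeps the default collator happy in this toy setup.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

# ---- Stage 1: continual pre-training with masked language modeling --------
biomed_corpus = Dataset.from_dict({"text": [
    "Aspirin irreversibly inhibits cyclooxygenase-1 and cyclooxygenase-2.",
    "BRCA1 mutations raise the lifetime risk of breast and ovarian cancer.",
]})  # toy stand-in for a PubMed-scale biomedical corpus

mlm_model = AutoModelForMaskedLM.from_pretrained(BASE_CKPT)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm_out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=5e-5,
                           report_to="none"),
    train_dataset=biomed_corpus.map(tokenize, batched=True, remove_columns=["text"]),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("medibio-deberta-sketch")
tokenizer.save_pretrained("medibio-deberta-sketch")

# ---- Stage 2: intermediate fine-tuning, then target-task fine-tuning ------
def fine_tune(checkpoint, dataset, output_dir):
    """Fine-tune a sequence-classification model on one labelled dataset."""
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                               per_device_train_batch_size=2, report_to="none"),
        train_dataset=dataset.map(tokenize, batched=True, remove_columns=["text"]),
    ).train()
    model.save_pretrained(output_dir)
    return output_dir

# Hypothetical data-rich intermediate task (sentence-level relevance).
intermediate_task = Dataset.from_dict({
    "text": ["Metformin lowers blood glucose in type 2 diabetes.",
             "The committee will meet again next spring."],
    "label": [1, 0],
})
# Hypothetical target task in ChemProt style (relation extraction cast as
# classification of a sentence with entity placeholders).
target_task = Dataset.from_dict({
    "text": ["@CHEMICAL$ inhibits the kinase activity of @GENE$.",
             "@CHEMICAL$ was dissolved in sterile water."],
    "label": [1, 0],
})

inter_ckpt = fine_tune("medibio-deberta-sketch", intermediate_task, "intermediate_out")
fine_tune(inter_ckpt, target_task, "target_out")  # encoder starts from intermediate weights
```

The point of the second stage is that the target-task model is initialized from the encoder weights learned on the intermediate task rather than directly from the continually pre-trained checkpoint, which is the general idea behind intermediate fine-tuning as used in the abstract.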
Pages: 141036-141044
Page count: 9