MediBioDeBERTa: Biomedical Language Model With Continuous Learning and Intermediate Fine-Tuning

Cited by: 0
Authors
Kim, Eunhui [1]
Jeong, Yuna [1]
Choi, Myung-Seok [1]
Affiliations
[1] Korea Inst Sci & Technol Informat, Daejeon 34131, South Korea
Keywords
Task analysis; Training; Biological system modeling; Transformers; Natural language processing; Correlation; Computational modeling; Language model; fine-tuning; domain-specific modeling
DOI
10.1109/ACCESS.2023.3341612
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
The emergence of large language models (LLMs) has marked a significant milestone in the evolution of natural language processing. With the expanded use of LLMs in multiple fields, the development of domain-specific pre-trained language models (PLMs) has become a natural progression and requirement. Developing domain-specific PLMs requires careful design, considering not only differences in training methods but also factors such as the type of training data and the choice of hyperparameters. This paper proposes MediBioDeBERTa, a specialized language model (LM) for biomedical applications. First, we present several practical analyses and methods for improving the performance of LMs in specialized domains. As the initial step, we developed SciDeBERTa v2, an LM specialized for the scientific domain. On the SciERC dataset, SciDeBERTa v2 achieves state-of-the-art performance on the named entity recognition (NER) task. We then provide an in-depth analysis of the datasets and training methods used in the biomedical field. Based on these analyses, MediBioDeBERTa was continually trained from SciDeBERTa v2 to specialize in the biomedical domain. Using the biomedical language understanding and reasoning benchmark (BLURB), we analyzed factors that degrade task performance and proposed additional improvement methods based on intermediate fine-tuning. The results demonstrate improved performance in three categories, NER, semantic similarity (SS), and question answering (QnA), as well as on the ChemProt relation extraction (RE) task in BLURB, compared with existing state-of-the-art LMs.
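The sketch below illustrates, in minimal form, the two-stage recipe summarised in the abstract: continual masked-language-model pre-training of a scientific-domain DeBERTa checkpoint on biomedical text, followed by intermediate fine-tuning on a related labelled task before fine-tuning on the final target task. It uses the Hugging Face transformers and datasets libraries; the checkpoint name (microsoft/deberta-v3-base standing in for SciDeBERTa v2), the toy sentences, and all hyperparameters are illustrative assumptions rather than the configuration used for MediBioDeBERTa.

```python
# Minimal sketch: (1) continual MLM pre-training on biomedical text,
# (2) intermediate fine-tuning, then target-task fine-tuning.
# Checkpoint names, toy data, and hyperparameters are placeholders,
# not the values used in the MediBioDeBERTa paper.
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_CKPT = "microsoft/deberta-v3-base"  # stand-in for a SciDeBERTa-v2-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(BASE_CKPT)

def tokenize(batch):
    # Fixed-length padding keeps the default collator happy in this toy setup.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

# ---- Stage 1: continual pre-training with masked language modeling --------
biomed_corpus = Dataset.from_dict({"text": [
    "Aspirin irreversibly inhibits cyclooxygenase-1 and cyclooxygenase-2.",
    "BRCA1 mutations raise the lifetime risk of breast and ovarian cancer.",
]})  # toy stand-in for a PubMed-scale biomedical corpus

mlm_model = AutoModelForMaskedLM.from_pretrained(BASE_CKPT)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm_out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=5e-5,
                           report_to="none"),
    train_dataset=biomed_corpus.map(tokenize, batched=True, remove_columns=["text"]),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("medibio-deberta-sketch")
tokenizer.save_pretrained("medibio-deberta-sketch")

# ---- Stage 2: intermediate fine-tuning, then target-task fine-tuning ------
def fine_tune(checkpoint, dataset, output_dir):
    """Fine-tune a sequence-classification model on one labelled dataset."""
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                               per_device_train_batch_size=2, report_to="none"),
        train_dataset=dataset.map(tokenize, batched=True, remove_columns=["text"]),
    ).train()
    model.save_pretrained(output_dir)
    return output_dir

# Hypothetical data-rich intermediate task (sentence-level relevance).
intermediate_task = Dataset.from_dict({
    "text": ["Metformin lowers blood glucose in type 2 diabetes.",
             "The committee will meet again next spring."],
    "label": [1, 0],
})
# Hypothetical target task in ChemProt style (relation extraction cast as
# classification of a sentence with entity placeholders).
target_task = Dataset.from_dict({
    "text": ["@CHEMICAL$ inhibits the kinase activity of @GENE$.",
             "@CHEMICAL$ was dissolved in sterile water."],
    "label": [1, 0],
})

inter_ckpt = fine_tune("medibio-deberta-sketch", intermediate_task, "intermediate_out")
fine_tune(inter_ckpt, target_task, "target_out")  # encoder starts from intermediate weights
```

The point of the second stage is that the target-task model is initialized from the encoder weights learned on the intermediate task rather than directly from the continually pre-trained checkpoint, which is the general idea behind intermediate fine-tuning as used in the abstract.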
Pages: 141036-141044
Page count: 9