Data augmentation based on large language models for radiological report classification

Cited by: 0
Authors
Collado-Montañez, Jaime [1]
Martín-Valdivia, María-Teresa [1]
Martínez-Cámara, Eugenio [1]
Affiliations
[1] Computer Science Department, University of Jaén, Campus Las Lagunillas s/n, Jaén, 23071, Spain
Keywords
Radiology
DOI
10.1016/j.knosys.2024.112745
Abstract
The International Classification of Diseases (ICD) is fundamental to healthcare: it provides a standardized framework for classifying and coding medical diagnoses and procedures, which enables the study of international public health patterns and trends. However, manually classifying medical reports according to this standard is slow, tedious and error-prone, which motivates automated systems that relieve healthcare professionals of this task and reduce the number of errors. In this paper, we propose an automated classification system based on Natural Language Processing that analyzes radiological reports and classifies them according to the ICD-10. Given the specialized language of radiological reports and the typically imbalanced class distribution of medical report datasets, we propose a methodology that leverages large language models to augment the data of underrepresented classes and adapts the classification language models to the specific language of radiological reports. The results show that the proposed methodology improves classification performance on the CARES corpus of radiological reports. © 2024 Elsevier B.V.
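The augmentation idea in the abstract (generating extra reports for underrepresented classes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `target` threshold, and the toy data are all hypothetical, and `placeholder_generate` merely stands in for a call to a large language model, which the abstract does not specify.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Report = Tuple[str, str]  # (report text, ICD-10 label)


def augmentation_plan(labels: List[str], target: int) -> Dict[str, int]:
    """Return how many synthetic reports each class needs to reach `target`."""
    counts = Counter(labels)
    return {label: target - n for label, n in counts.items() if n < target}


def augment(
    reports: List[Report],
    target: int,
    generate: Callable[[str, List[str]], str],
) -> List[Report]:
    """Add generated reports until every class has at least `target` samples."""
    # Group the real reports by label so the generator can see class examples.
    by_label: Dict[str, List[str]] = {}
    for text, label in reports:
        by_label.setdefault(label, []).append(text)

    plan = augmentation_plan([label for _, label in reports], target)
    synthetic: List[Report] = []
    for label, missing in plan.items():
        for _ in range(missing):
            synthetic.append((generate(label, by_label[label]), label))
    return reports + synthetic


def placeholder_generate(label: str, examples: List[str]) -> str:
    """Hypothetical stand-in for an LLM call; it only tags an existing example."""
    return f"[synthetic, class {label}] {examples[0]}"


# Toy data (assumed for illustration): three reports in one class, one in another.
toy_reports: List[Report] = [
    ("toy report 1", "J18.9"),
    ("toy report 2", "J18.9"),
    ("toy report 3", "J18.9"),
    ("toy report 4", "S52.5"),
]
balanced = augment(toy_reports, target=3, generate=placeholder_generate)
```

In practice the generator would prompt a language model with the class label and a few real reports, and the augmented set would then be used to fine-tune the classifier; both steps are left out here because the abstract gives no details on them.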