Automated labelling of radiology reports using natural language processing: Comparison of traditional and newer methods

Cited by: 3
Authors
Chng, Seo Yi [1]
Tern, Paul J. W. [2]
Kan, Matthew R. X. [3]
Cheng, Lionel T. E. [4]
Affiliations
[1] Natl Univ Singapore, Dept Paediat, 5 Lower Kent Ridge Rd, Singapore 119074, Singapore
[2] Natl Heart Ctr, Dept Cardiol, Singapore, Singapore
[3] NUS High Sch Math & Sci, Singapore, Singapore
[4] Singapore Gen Hosp, Dept Diagnost Radiol, Singapore, Singapore
Source
HEALTH CARE SCIENCE | 2023 / Vol. 2 / Issue 02
Keywords
automated labelling; machine learning; natural language processing; neural network; radiology;
DOI
10.1002/hcs2.40
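Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Subject classification number
Abstract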
Automated labelling of radiology reports using natural language processing allows for the labelling of ground truth for large datasets of radiological studies that are required for training of computer vision models. This paper explains the necessary data preprocessing steps, reviews the main methods for automated labelling and compares their performance. There are four main methods of automated labelling, namely: (1) rules-based text-matching algorithms, (2) conventional machine learning models, (3) neural network models and (4) Bidirectional Encoder Representations from Transformers (BERT) models. Rules-based labellers perform a brute force search against manually curated keywords and are able to achieve high F1 scores. However, they require proper handling of negative words. Machine learning models require preprocessing that involves tokenization and vectorization of text into numerical vectors. Multilabel classification approaches are required in labelling radiology reports and conventional models can achieve good performance if they have large enough training sets. Deep learning models make use of connected neural networks, often a long short-term memory network, and are similarly able to achieve good performance if trained on a large dataset. BERT is a transformer-based model that utilizes attention. Pretrained BERT models only require fine-tuning with small datasets. In particular, domain-specific BERT models can achieve superior performance compared with the other methods for automated labelling.
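As an illustration of the rules-based approach described in the abstract, the following is a minimal sketch of a keyword labeller with simple negation handling. It is not the method evaluated in the paper: the keyword lists, the negation cues, the label names and the function `label_report` are all hypothetical and chosen only for demonstration. The output is multilabel, with one 0/1 flag per finding.

```python
import re

# Hypothetical keyword lists (assumption); a real labeller would rely on
# manually curated clinical vocabularies.
KEYWORDS = {
    "pneumothorax": ["pneumothorax"],
    "pleural_effusion": ["pleural effusion", "effusion"],
    "cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Hypothetical negation cues (assumption). A cue negates any keyword that
# appears later in the same sentence.
NEGATIONS = ["no", "without", "negative for"]


def label_report(report):
    """Return a dict mapping each finding to 1 (mentioned, not negated) or 0."""
    labels = {name: 0 for name in KEYWORDS}
    # Split into rough sentences so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", report.lower()):
        for name, terms in KEYWORDS.items():
            for term in terms:
                match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
                if match is None:
                    continue
                # Look for a negation cue anywhere before the keyword.
                prefix = sentence[:match.start()]
                negated = any(
                    re.search(r"\b" + re.escape(cue) + r"\b", prefix)
                    for cue in NEGATIONS
                )
                if not negated:
                    labels[name] = 1
    return labels


example = "No pneumothorax. There is a small left pleural effusion. Heart size is normal."
result = label_report(example)
```

In this sketch a negation cue applies to the rest of its sentence, so a sentence such as "no pneumothorax but effusion present" would wrongly suppress the effusion label. This is the kind of negation-handling difficulty the abstract alludes to.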
Pages: 120 - 128
Number of pages: 9
Related papers
50 in total
  • [1] Automated anonymization of radiology reports: comparison of publicly available natural language processing and large language models
    Langenbach, Marcel C.
    Foldyna, Borek
    Hadzic, Ibrahim
    Langenbach, Isabel L.
    Raghu, Vineet K.
    Lu, Michael T.
    Neilan, Tomas G.
    Heemelaar, Julius C.
    EUROPEAN RADIOLOGY, 2024,
  • [2] Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm
    Bozkurt, Selen
    Alkim, Emel
    Banerjee, Imon
    Rubin, Daniel L.
    JOURNAL OF DIGITAL IMAGING, 2019, 32 (04) : 544 - 553
  • [3] Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm
    Selen Bozkurt
    Emel Alkim
    Imon Banerjee
    Daniel L. Rubin
    Journal of Digital Imaging, 2019, 32 : 544 - 553
  • [4] NATURAL LANGUAGE PROCESSING FOR THE AUTOMATED QUANTIFICATION OF BRAIN METASTASES IN RADIOLOGY FREE TEXT REPORTS
    Cote, David
    Senders, Joeky
    Karhade, Aditya
    Gupta, Saksham
    Lamba, Nayan
    Hancock, Brooke
    Smith, Timothy
    Arnaout, Omar
    NEURO-ONCOLOGY, 2017, 19 : 44 - 45
  • [5] Identification of gallstones from radiology reports using natural language processing
    Fairfield, Cameron
    Ots, Riinu
    Antai, Roseline
    Drake, Tom
    Knight, Stephen
    Wigmore, Stephen
    Harrison, Ewen
    BRITISH JOURNAL OF SURGERY, 2018, 105 : 58 - 58
  • [6] Automated vetting of radiology referrals: exploring natural language processing and traditional machine learning approaches
    Jaka Potočnik
    Edel Thomas
    Ronan Killeen
    Shane Foley
    Aonghus Lawlor
    John Stowe
    Insights into Imaging, 13
  • [7] Automated vetting of radiology referrals: exploring natural language processing and traditional machine learning approaches
    Potocnik, Jaka
    Thomas, Edel
    Killeen, Ronan
    Foley, Shane
    Lawlor, Aonghus
    Stowe, John
    INSIGHTS INTO IMAGING, 2022, 13 (01)
  • [8] Extracting information on pneumonia in infants using natural language processing of radiology reports
    Mendonça, EA
    Haas, J
    Shagina, L
    Larson, E
    Friedman, C
    JOURNAL OF BIOMEDICAL INFORMATICS, 2005, 38 (04) : 314 - 321
  • [9] Automated interpretation of stress echocardiography reports using natural language processing
    Zheng, Chengyi
    Sun, Benjamin C.
    Wu, Yi-Lin
    Ferencik, Maros
    Lee, Ming-Sum
    Redberg, Rita F.
    Kawatkar, Aniket A.
    Musigdilok, Visanee V.
    Sharp, Adam L.
    EUROPEAN HEART JOURNAL - DIGITAL HEALTH, 2022, 3 (04) : 626 - 637
  • [10] Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches
    Solti, Imre
    Cooke, Colin R.
    Xia, Fei
    Wurfel, Mark M.
    BIBMW: 2009 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOP, 2009, : 308 - +