Information extraction from weakly structured radiological reports with natural language queries

Cited by: 4
Authors
Dada, Amin [1]
Ufer, Tim Leon [1]
Kim, Moon [1]
Hasin, Max [1]
Spieker, Nicola [2]
Forsting, Michael [1, 3]
Nensa, Felix [1, 3]
Egger, Jan [1, 4]
Kleesiek, Jens [1, 2, 5]
Affiliations
[1] Univ Hosp Essen, Inst AI Med IKIM, Girardetstr 2, D-45131 Essen, Germany
[2] Dr Kruger MVZ GmbH, Bocholt, Germany
[3] Univ Hosp Essen, Inst Diagnost & Intervent Radiol & Neuroradiol, Essen, Germany
[4] Univ Med Essen, Canc Res Ctr Cologne Essen CCCE, Essen, Germany
[5] German Canc Consortium DKTK, Partner Site Essen, Essen, Germany
Keywords
Information extraction; Natural language processing; Machine learning
DOI
10.1007/s00330-023-09977-3
Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Objectives: Provide physicians and researchers an efficient way to extract information from weakly structured radiology reports with natural language processing (NLP) machine learning models.

Methods: We evaluate seven different German bidirectional encoder representations from transformers (BERT) models on a dataset of 857,783 unlabeled radiology reports and an annotated reading comprehension dataset in the format of SQuAD 2.0 based on 1223 additional reports.

Results: Continued pre-training of a BERT model on the radiology dataset and a medical online encyclopedia resulted in the most accurate model with an F1-score of 83.97% and an exact match score of 71.63% for answerable questions and 96.01% accuracy in detecting unanswerable questions. Fine-tuning a non-medical model without further pre-training led to the lowest-performing model. The final model proved stable against variation in the formulations of questions and in dealing with questions on topics excluded from the training set.

Conclusions: General domain BERT models further pre-trained on radiological data achieve high accuracy in answering questions on radiology reports. We propose to integrate our approach into the workflow of medical practitioners and researchers to extract information from radiology reports.

Clinical relevance statement: By reducing the need for manual searches of radiology reports, radiologists' resources are freed up, which indirectly benefits patients.

Key Points:
  • BERT models pre-trained on general domain datasets and radiology reports achieve high accuracy (83.97% F1-score) on question-answering for radiology reports.
  • The best performing model achieves an F1-score of 83.97% for answerable questions and 96.01% accuracy for questions without an answer.
  • Additional radiology-specific pretraining of all investigated BERT models improves their performance.
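The abstract reports an exact match score and a token-level F1-score for extractive question answering in the SQuAD 2.0 format. As a rough illustration of how such scores are commonly computed, here is a minimal sketch. It assumes simple lowercase whitespace tokenization, whereas the official SQuAD evaluation script also strips punctuation and English articles, and this record does not describe the authors' own evaluation code. The function names and example strings are hypothetical.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    # Simplified normalization (an assumption): lowercase, split on whitespace.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 only if the normalized token sequences are identical.
    return float(tokens(prediction) == tokens(reference))


def token_f1(prediction: str, reference: str) -> float:
    pred, ref = tokens(prediction), tokens(reference)
    if not pred or not ref:
        # SQuAD 2.0 convention: an unanswerable question has an empty reference,
        # and only an empty prediction counts as correct.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up German answer spans, for illustration only.
    print(exact_match("kein Pleuraerguss", "kein Pleuraerguss rechts"))
    print(token_f1("kein Pleuraerguss", "kein Pleuraerguss rechts"))
```

A partially overlapping span therefore gets no exact match credit but a nonzero F1, which is consistent with the reported exact match score (71.63%) being lower than the F1-score (83.97%).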
Pages: 330-337
Page count: 8
Related papers
50 in total
  • [1] Information extraction from weakly structured radiological reports with natural language queries
    Amin Dada
    Tim Leon Ufer
    Moon Kim
    Max Hasin
    Nicola Spieker
    Michael Forsting
    Felix Nensa
    Jan Egger
    Jens Kleesiek
    [J]. European Radiology, 2024, 34 : 330 - 337
  • [2] Explaining Structured Queries in Natural Language
    Koutrika, Georgia
    Simitsis, Alkis
    Ioannidis, Yannis E.
    [J]. 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING ICDE 2010, 2010, : 333 - 344
  • [3] Information extraction from German radiological reports for general clinical text and language understanding
    Jantscher, Michael
    Gunzer, Felix
    Kern, Roman
    Hassler, Eva
    Tschauner, Sebastian
    Reishofer, Gernot
    [J]. SCIENTIFIC REPORTS, 2023, 13 (01)
  • [4] Information extraction from German radiological reports for general clinical text and language understanding
    Michael Jantscher
    Felix Gunzer
    Roman Kern
    Eva Hassler
    Sebastian Tschauner
    Gernot Reishofer
    [J]. Scientific Reports, 13
  • [5] MEDSYNDIKATE - a natural language system for the extraction of medical information from findings reports
    Hahn, U
    Romacker, M
    Schulz, S
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2002, 67 (1-3) : 63 - 74
  • [6] Learning to generate structured queries from natural language with indirect supervision
    Bai, Ziwei
    Yu, Bo
    Wu, Bowen
    Wang, Zhuoran
    Wang, Baoxun
    [J]. COMPUTER SPEECH AND LANGUAGE, 2021, 67
  • [7] EXTRACTING STRUCTURED INFORMATION FROM PATHOLOGY REPORTS USING NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING
    Odisho, Anobel
    Park, Briton
    Altieri, Nicholas
    Murdoch, William
    Carroll, Peter
    Cooperberg, Matthew
    Yu, Bin
    [J]. JOURNAL OF UROLOGY, 2019, 201 (04) : E1031 - E1032
  • [8] Translation of natural language queries to structured data sources
    Posevkin, Ruslan
    Bessmertny, Igor
    [J]. 2015 9TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT), 2015, : 57 - 59
  • [9] NATURAL LANGUAGE PROCESSING OF ESOPHAGOGASTRODUODENOSCOPY REPORTS FOR INFORMATION EXTRACTION OF GASTRIC DISEASES
    Bae, Jung Ho
    Han, Hyun Wook
    Song, Gyuseon
    [J]. GASTROINTESTINAL ENDOSCOPY, 2022, 95 (06) : AB247 - AB248
  • [10] Harnessing Natural Language Processing for Structured Information Extraction from Radiology Reports in Crohn's Disease: A Nationwide Study From the epi-IIRN
    Hazan, L.
    Focht, G.
    Gavrielov, N.
    Reichart, R.
    Friss, C.
    Kuint, R. Cytter
    Turner, D.
    Freiman, M.
    [J]. JOURNAL OF CROHNS & COLITIS, 2024, 18 : I626 - I627