A Large Language Model to Detect Negated Expressions in Radiology Reports

Cited by: 0
Authors
Su, Yvonne [1 ]
Babore, Yonatan B. [1 ]
Kahn Jr, Charles E. [1 ,2 ]
Affiliations
[1] Univ Penn, Perelman Sch Med, Dept Radiol, 3400 Spruce St, Philadelphia, PA 19104 USA
[2] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
Keywords
Large language models; Negated expression (negex) detection; Named entity recognition; Natural language processing; Radiology reports
DOI
10.1007/s10278-024-01274-9
Chinese Library Classification
R8 (Special Medicine); R445 (Diagnostic Imaging)
Subject Classification Codes
1002; 100207; 1009
Abstract
Natural language processing (NLP) is crucial for accurately extracting information from unstructured text to provide insights for clinical decision-making, quality improvement, and medical research. This study compared the performance of a rule-based NLP system and a medical-domain transformer-based model in detecting negated concepts in radiology reports. Using a corpus of 984 de-identified radiology reports from a large U.S.-based academic health system (1000 consecutive reports, excluding 16 duplicates), the investigators compared the rule-based medspaCy system and the Clinical Assertion and Negation Classification Bidirectional Encoder Representations from Transformers (CAN-BERT) system in detecting negated expressions of terms from RadLex, the Unified Medical Language System Metathesaurus, and the Radiology Gamuts Ontology. A power analysis determined that a sample of 382 terms would achieve α = 0.05 and a statistical power of 0.8 for McNemar's test; based on an estimated 15% prevalence of negated terms, 2800 randomly selected terms were annotated manually as negated or not negated. The precision, recall, and F1 of the two models were compared using McNemar's test. Of the 2800 terms, 387 (13.8%) were negated. For negation detection, medspaCy attained a recall of 0.795, a precision of 0.356, and an F1 of 0.492; CAN-BERT achieved a recall of 0.785, a precision of 0.768, and an F1 of 0.777. Although the models' recall did not differ significantly, CAN-BERT had significantly better precision (χ² = 304.64; p < 0.001). The transformer-based CAN-BERT model detected negated terms in radiology reports with high precision and recall, and its precision significantly exceeded that of the rule-based medspaCy system. Use of such a system will improve data extraction from textual reports to support information retrieval, AI model training, and discovery of causal relationships.
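The evaluation metrics reported in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code; the discordant-pair counts in the McNemar example are hypothetical, and only the F1 sanity check uses figures from the abstract.

```python
# Illustrative sketch (not the study's code) of the reported evaluation
# metrics: precision, recall, F1, and McNemar's chi-square statistic.

def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar's chi-square for two paired classifiers, where b and c
    count the items that exactly one of the two systems classified
    correctly (the discordant pairs)."""
    return (b - c) ** 2 / (b + c)

# Sanity check against the abstract: medspaCy's precision (0.356) and
# recall (0.795) imply an F1 of about 0.492.
p, r = 0.356, 0.795
print(round(2 * p * r / (p + r), 3))  # 0.492

# Hypothetical discordant counts, only to show the statistic's shape:
print(mcnemar_chi2(40, 10))  # 18.0
```

With a significance threshold of α = 0.05 and one degree of freedom, a McNemar χ² above 3.84 indicates a significant difference between the paired classifiers, which is how the precision gap (χ² = 304.64) is judged significant.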
Pages: 7
Related Papers
50 records in total
  • [21] Langenbach, Marcel C.; Foldyna, Borek; Hadzic, Ibrahim; Langenbach, Isabel L.; Raghu, Vineet K.; Lu, Michael T.; Neilan, Tomas G.; Heemelaar, Julius C. Automated anonymization of radiology reports: comparison of publicly available natural language processing and large language models. European Radiology, 2024.
  • [22] Le Guellec, Bastien; Lefevre, Alexandre; Geay, Charlotte; Shorten, Lucas; Bruge, Cyril; Hacein-Bey, Lotfi; Amouyel, Philippe; Pruvo, Jean-Pierre; Kuchcinski, Gregory; Hamroun, Aghiles. Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports. Radiology: Artificial Intelligence, 2024, 6 (4).
  • [23] Yamagishi, Yosuke; Nakamura, Yuta; Hanaoka, Shouhei; Abe, Osamu. Large Language Model Approach for Zero-Shot Information Extraction and Clustering of Japanese Radiology Reports: Algorithm Development and Validation. JMIR Cancer, 2025, 11.
  • [24] Reichenpfader, Daniel; Muller, Henning; Denecke, Kerstin. Large language model-based information extraction from free-text radiology reports: a scoping review protocol. BMJ Open, 2023, 13 (12).
  • [25] Gordon, Emile B.; Maxfield, Charles M.; French, Robert; Fish, Laura J.; Romm, Jacob; Barre, Emily; Kinne, Erica; Peterson, Ryan; Grimm, Lars J. Large Language Model Use in Radiology Residency Applications: Unwelcomed but Inevitable. Journal of the American College of Radiology, 2025, 22 (1): 33-40.
  • [26] Gupta, Amit; Singh, Swarndeep; Malhotra, Hema; Pruthi, Himanshu; Sharma, Aparna; Garg, Amit K.; Yadav, Mukesh; Kandasamy, Devasenathipathy; Batra, Atul; Rangarajan, Krithika. Provision of Radiology Reports Simplified With Large Language Models to Patients With Cancer: Impact on Patient Satisfaction. JCO Clinical Cancer Informatics, 2025, 9.
  • [27] Woznicki, Piotr; Laqua, Caroline; Fiku, Ina; Hekalo, Amar; Truhn, Daniel; Engelhardt, Sandy; Kather, Jakob; Foersch, Sebastian; D'Antonoli, Tugba Akinci; dos Santos, Daniel Pinto; Baessler, Bettina; Laqua, Fabian Christopher. Automatic structuring of radiology reports with on-premise open-source large language models. European Radiology, 2025, 35 (4): 2018-2029.
  • [28] Zhong, Jiayang; Sehgal, Kanika; Hickey, Kyle; Mohammad, Aziza; Robinson, Stephen; Farrell, James J.; Shung, Dennis. A local large language model pipeline automatically risk stratifies pancreatic cysts for population health management from serial radiology reports. Gastroenterology, 2024, 166 (5): S687.
  • [29] Ray, Partha Pratim. Letter to the Editor: A critical evaluation on the use of large language model for radiology research. European Radiology, 2023, 33 (12): 9462-9463.