Between Always and Never: Evaluating Uncertainty in Radiology Reports Using Natural Language Processing

Cited by: 16
Authors
Callen, Andrew L. [1]
Dupont, Sara M. [2]
Price, Adi [3]
Laguna, Ben [3]
McCoy, David [3]
Do, Bao [4]
Talbott, Jason [3]
Kohli, Marc [3]
Narvid, Jared [3]
Affiliations
[1] Univ Colorado, Dept Radiol, Anschutz Med Campus, Denver, CO 80045 USA
[2] Subtle Med Inc, Menlo Pk, CA USA
[3] Univ Calif San Francisco, Dept Radiol & Biomed Imaging, San Francisco, CA 94143 USA
[4] Stanford Univ, Med Ctr, Dept Radiol, Stanford, CA 94305 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Diagnostic uncertainty; Natural language processing; MALPRACTICE; INFORMATION; ACCURACY; MEDICINE; ERRORS;
DOI
10.1007/s10278-020-00379-1
Chinese Library Classification number
R8 [Special medicine]; R445 [Diagnostic imaging];
Discipline classification code
1002 ; 100207 ; 1009 ;
Abstract
The ideal radiology report reduces diagnostic uncertainty, while avoiding ambiguity whenever possible. The purpose of this study was to characterize the use of uncertainty terms in radiology reports at a single institution and compare the use of these terms across imaging modalities, anatomic sections, patient characteristics, and radiologist characteristics. We hypothesized that there would be variability among radiologists and between subspecialities within radiology regarding the use of uncertainty terms and that the length of the impression of a report would be a predictor of use of uncertainty terms. Finally, we hypothesized that use of uncertainty terms would often be interpreted by human readers as "hedging." To test these hypotheses, we applied a natural language processing (NLP) algorithm to assess and count the number of uncertainty terms within radiology reports. An algorithm was created to detect usage of a published set of uncertainty terms. All 642,569 radiology report impressions from 171 reporting radiologists were collected from 2011 through 2015. For validation, two radiologists without knowledge of the software algorithm reviewed report impressions and were asked to determine whether the report was "uncertain" or "hedging." The relationship between the presence of 1 or more uncertainty terms and the human readers' assessment was compared. There were significant differences in the proportion of reports containing uncertainty terms across patient admission status and across anatomic imaging subsections. Reports with uncertainty were significantly longer than those without, although report length was not significantly different between subspecialities or modalities. There were no significant differences in rates of uncertainty when comparing the experience of the attending radiologist. When compared with reader 1 as a gold standard, accuracy was 0.91, sensitivity was 0.92, specificity was 0.9, and precision was 0.88, with an F1-score of 0.9. When compared with reader 2, accuracy was 0.84, sensitivity was 0.88, specificity was 0.82, and precision was 0.68, with an F1-score of 0.77. Substantial variability exists among radiologists and subspecialities regarding the use of uncertainty terms, and this variability cannot be explained by years of radiologist experience or differences in proportions of specific modalities. Furthermore, detection of uncertainty terms demonstrates good test characteristics for predicting human readers' assessment of uncertainty.
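The abstract describes counting uncertainty terms in report impressions and comparing the flag "1 or more terms present" against a human reader's judgment. The following is a minimal Python sketch of that general idea, not the authors' algorithm. The term list, the function names (`count_uncertainty_terms`, `is_flagged`, `evaluate`), and the sample impressions are all hypothetical and chosen only for illustration; the study used a published set of uncertainty terms that is not reproduced in this record.

```python
import re
from dataclasses import dataclass

# Hypothetical term list for illustration only; the study used a
# published set of uncertainty terms that is not listed in this record.
UNCERTAINTY_TERMS = [
    "possibly",
    "probable",
    "may represent",
    "cannot exclude",
    "suspicious for",
]

# One case-insensitive pattern that matches any term on word boundaries.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in UNCERTAINTY_TERMS) + r")\b",
    re.IGNORECASE,
)


def count_uncertainty_terms(impression: str) -> int:
    """Count occurrences of the listed terms in one report impression."""
    return len(_PATTERN.findall(impression))


def is_flagged(impression: str) -> bool:
    """Flag an impression that contains 1 or more uncertainty terms."""
    return count_uncertainty_terms(impression) >= 1


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float
    f1: float


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def evaluate(predicted: list[bool], reference: list[bool]) -> Metrics:
    """Compare algorithm flags with one reader's labels (the reference)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    f1_denominator = precision + sensitivity
    f1 = 2 * precision * sensitivity / f1_denominator if f1_denominator else 0.0
    return Metrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=_ratio(tn, tn + fp),
        precision=precision,
        f1=f1,
    )


# Made-up impressions and made-up reader labels, for demonstration only.
SAMPLE_IMPRESSIONS = [
    "Findings may represent early pneumonia; cannot exclude aspiration.",
    "No acute intracranial abnormality.",
    "Probable small left pleural effusion.",
    "Stable postoperative changes.",
]
SAMPLE_READER_LABELS = [True, False, True, True]

if __name__ == "__main__":
    flags = [is_flagged(text) for text in SAMPLE_IMPRESSIONS]
    print(evaluate(flags, SAMPLE_READER_LABELS))
```

As a consistency check on the reported figures, the F1-score is the harmonic mean of precision and sensitivity: for reader 1, 2 × 0.88 × 0.92 / (0.88 + 0.92) ≈ 0.90, and for reader 2, 2 × 0.68 × 0.88 / (0.68 + 0.88) ≈ 0.77, both matching the abstract.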
Pages: 1194-1201
Page count: 8