Use of Natural Language Processing to Extract and Classify Papillary Thyroid Cancer Features From Surgical Pathology Reports

Cited by: 2
Authors
Loor-Torres, Ricardo [1]
Wu, Yuqi [2]
Cabezas, Esteban [1]
Borras-Osorio, Mariana [1]
Toro-Tobon, David [3]
Duran, Mayra [1]
Al Zahidy, Misk [1]
Chavez, Maria Mateo [1]
Jacome, Cristian Soto [1]
Fan, Jungwei W. [2]
Ospina, Naykky M. Singh [4]
Wu, Yonghui [5]
Brito, Juan P. [1, 3]
Affiliations
[1] Mayo Clin, Div Endocrinol Diabet Nutr & Metab, Knowledge & Evaluat Res Unit, 200 First St SW, Rochester, MN 55902 USA
[2] Mayo Clin, Dept Artificial Intelligence & Informat, Rochester, MN USA
[3] Mayo Clin, Div Endocrinol Diabet Metab & Nutr, Rochester, MN USA
[4] Univ Florida, Dept Med, Div Endocrinol, Gainesville, FL USA
[5] Univ Florida, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
Funding
National Institutes of Health (USA)
Keywords
artificial intelligence; Natural Language Processing; thyroid cancer;
DOI
10.1016/j.eprac.2024.08.008
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: We aim to use Natural Language Processing to automate the extraction and classification of thyroid cancer risk factors from pathology reports. Methods: We analyzed 1410 surgical pathology reports from adult patients with papillary thyroid cancer from 2010 to 2019. Structured and unstructured reports were used to create a consensus-based ground truth dictionary and were categorized into modified recurrence risk levels. Unstructured reports were narrative, while structured reports followed standardized formats. We developed ThyroPath, a rule-based Natural Language Processing pipeline, to extract and classify thyroid cancer features into risk categories. Training involved 225 reports (150 structured, 75 unstructured), with testing on 170 reports (120 structured, 50 unstructured) for evaluation. The pipeline's performance was assessed using both strict and lenient criteria for accuracy, precision, recall, and F1-score, a metric that combines precision and recall. Results: In extraction tasks, ThyroPath achieved overall strict F1-scores of 93% for structured reports and 90% for unstructured reports, covering 18 thyroid cancer pathology features. In classification tasks, ThyroPath-extracted information yielded an overall accuracy of 93% in categorizing reports by their guideline-based risk of recurrence: 76.9% for high-risk, 86.8% for intermediate-risk, and 100% for both low-risk and very low-risk cases. However, ThyroPath achieved 100% accuracy across all risk categories when given human-extracted pathology information. Conclusions: ThyroPath shows promise in automating the extraction and recurrence risk classification of thyroid pathology reports at large scale. It offers a solution to laborious manual reviews and a means of advancing virtual registries. However, it requires further validation before implementation. © 2024 AACE. Published by Elsevier Inc.
All rights are reserved, including those for text and data mining, AI training, and similar technologies.
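Note on the metrics: the abstract reports precision, recall, and F1-score under strict and lenient criteria but does not define those criteria. The sketch below is illustrative only and is not the ThyroPath implementation. It assumes that strict matching means an exact (case-insensitive) match of the extracted value and that lenient matching accepts a partial overlap; the function names and the sample feature names are hypothetical.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard definitions; F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def values_match(gold_value: str, pred_value: str, lenient: bool) -> bool:
    """Assumed criteria: strict = exact match, lenient = one value contains the other."""
    g, p = gold_value.strip().lower(), pred_value.strip().lower()
    if lenient:
        return bool(g) and bool(p) and (g in p or p in g)
    return g == p


def evaluate(gold: dict[str, str], pred: dict[str, str], lenient: bool = False):
    """Score one report: gold and pred map feature name -> extracted value."""
    tp = sum(
        1 for name, value in pred.items()
        if name in gold and values_match(gold[name], value, lenient)
    )
    fp = len(pred) - tp   # extracted values that are wrong or not in the gold set
    fn = len(gold) - tp   # gold features that were missed or extracted incorrectly
    return precision_recall_f1(tp, fp, fn)


# Hypothetical example report (feature names are placeholders, not from the paper).
gold = {"tumor_size": "1.2 cm", "margins": "negative", "lymphovascular_invasion": "absent"}
pred = {"tumor_size": "1.2 cm", "margins": "margins negative", "extrathyroidal_extension": "present"}

strict_scores = evaluate(gold, pred, lenient=False)
lenient_scores = evaluate(gold, pred, lenient=True)
```

Under these assumptions the lenient score is never lower than the strict score, because every exact match is also a partial match.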
Pages: 1051-1058
Number of pages: 8