Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework

Cited by: 29
Authors
Lybarger, Kevin [1]
Ostendorf, Mari [2]
Thompson, Matthew [3]
Yetisgen, Meliha [1]
Affiliations
[1] Univ Washington, Biomed & Hlth Informat, Box 358047, Seattle, WA 98109 USA
[2] Univ Washington, Dept Elect & Comp Engn, Campus Box 352500 185, Seattle, WA 98195 USA
[3] Univ Washington, Dept Family Med, Box 354696, Seattle, WA 98195 USA
Funding
U.S. National Institutes of Health;
Keywords
COVID-19; Coronavirus; Machine learning; Natural language processing; Information extraction; METAMAP;
DOI
10.1016/j.jbi.2021.103761
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Coronavirus disease 2019 (COVID-19) is a global pandemic. Although much has been learned about the novel coronavirus since its emergence, there are many open questions related to tracking its spread, describing symptomology, predicting the severity of infection, and forecasting healthcare utilization. Free-text clinical notes contain critical information for resolving these questions. Data-driven, automatic information extraction models are needed to use this text-encoded information in large-scale studies. This work presents a new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT) Corpus, which comprises 1,472 notes with detailed annotations characterizing COVID-19 diagnoses, testing, and clinical presentation. We introduce a span-based event extraction model that jointly extracts all annotated phenomena, achieving high performance in identifying COVID-19 and symptom events with associated assertion values (0.83-0.97 F1 for events and 0.73-0.79 F1 for assertions). Our span-based event extraction model outperforms an extractor built on MetaMapLite for the identification of symptoms with assertion values. In a secondary use application, we predicted COVID-19 test results using structured patient data (e.g. vital signs and laboratory results) and automatically extracted symptom information, to explore the clinical presentation of COVID-19. Automatically extracted symptoms improve COVID-19 prediction performance, beyond structured data alone.
Pages: 13