Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework

Cited by: 29
Authors
Lybarger, Kevin [1]
Ostendorf, Mari [2]
Thompson, Matthew [3]
Yetisgen, Meliha [1]
Affiliations
[1] Univ Washington, Biomed & Hlth Informat, Box 358047, Seattle, WA 98109 USA
[2] Univ Washington, Dept Elect & Comp Engn, Campus Box 352500 185, Seattle, WA 98195 USA
[3] Univ Washington, Dept Family Med, Box 354696, Seattle, WA 98195 USA
Funding
National Institutes of Health (NIH), USA
Keywords
COVID-19; Coronavirus; Machine learning; Natural language processing; Information extraction; METAMAP;
DOI
10.1016/j.jbi.2021.103761
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Coronavirus disease 2019 (COVID-19) is a global pandemic. Although much has been learned about the novel coronavirus since its emergence, there are many open questions related to tracking its spread, describing symptomology, predicting the severity of infection, and forecasting healthcare utilization. Free-text clinical notes contain critical information for resolving these questions. Data-driven, automatic information extraction models are needed to use this text-encoded information in large-scale studies. This work presents a new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT) Corpus, which comprises 1,472 notes with detailed annotations characterizing COVID-19 diagnoses, testing, and clinical presentation. We introduce a span-based event extraction model that jointly extracts all annotated phenomena, achieving high performance in identifying COVID-19 and symptom events with associated assertion values (0.83-0.97 F1 for events and 0.73-0.79 F1 for assertions). Our span-based event extraction model outperforms an extractor built on MetaMapLite for the identification of symptoms with assertion values. In a secondary use application, we predicted COVID-19 test results using structured patient data (e.g. vital signs and laboratory results) and automatically extracted symptom information, to explore the clinical presentation of COVID-19. Automatically extracted symptoms improve COVID-19 prediction performance, beyond structured data alone.
Pages: 13