An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)

Cited by: 4
Authors
Liu, Sijia [1]
Wen, Andrew [1]
Wang, Liwei [1]
He, Huan [1]
Fu, Sunyang [1]
Miller, Robert [2]
Williams, Andrew [2]
Harris, Daniel [3]
Kavuluru, Ramakanth [3]
Liu, Mei [4]
Abu-el-Rub, Noor [4]
Schutte, Dalton [5]
Zhang, Rui [5]
Rouhizadeh, Masoud [6]
Osborne, John D. [7]
He, Yongqun [8]
Topaloglu, Umit [9]
Hong, Stephanie S. [10]
Saltz, Joel H. [11]
Schaffter, Thomas [12]
Pfaff, Emily [13]
Chute, Christopher G. [10]
Duong, Tim [14]
Haendel, Melissa A. [15]
Fuentes, Rafael [16]
Szolovits, Peter [17]
Xu, Hua [18]
Liu, Hongfang [1, 18]
Affiliations
[1] Mayo Clin, Dept Artificial Intelligence & Informat, Rochester, MN USA
[2] Tufts Med Ctr, Tufts Clin & Translat Sci Inst, Boston, MA USA
[3] Univ Kentucky, Dept Internal Med, Lexington, KY USA
[4] Univ Kansas, Dept Internal Med, Med Ctr, Kansas City, KS USA
[5] Univ Minnesota Twin Cities, Dept Pharmaceut Care Hlth Syst, Minneapolis, MN USA
[6] Univ Florida, Dept Pharmaceut Outcomes & Policy, Gainesville, FL USA
[7] Univ Alabama Birmingham, Dept Comp Sci, Birmingham, AL USA
[8] Univ Michigan, Dept Comp Med & Bioinformat, Med Sch, Ann Arbor, MI USA
[9] Wake Forest Sch Med, Dept Canc Biol, Winston Salem, NC USA
[10] Johns Hopkins Univ, Dept Med, Baltimore, MD USA
[11] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY USA
[12] Sage Bionetworks, Seattle, WA USA
[13] Univ North Carolina Chapel Hill, Dept Med, Chapel Hill, NC USA
[14] Albert Einstein Coll Med, Dept Radiol, Bronx, NY USA
[15] Univ Colorado, Ctr Hlth AI, Anschutz Med Campus, Denver, CO USA
[16] Alex Informat, North Bethesda, MD USA
[17] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA USA
[18] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX USA
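Funding
National Institutes of Health (NIH)
Keywords
electronic health records; natural language processing; federated learning; multi-institutional data annotation
DOI
10.1093/jamia/ocad134
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract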
Despite recent methodological advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework, using data from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Pages: 2036-2040
Number of pages: 5
Related papers
50 in total
  • [31] Outcomes of SARS-CoV-2 infection among patients with orthopaedic fracture surgery in the National COVID Cohort Collaborative (N3C)
    Levitt, Eli B.
    Patch, David A.
    Hess, Matthew C.
    Terrero, Alfredo
    Jaeger, Byron
    Haendel, Melissa A.
    Chute, Christopher G.
    Yeager, Matthew T.
    Ponce, Brent A.
    Theiss, Steven M.
    Spitler, Clay A.
    Johnson, Joey P.
    INJURY-INTERNATIONAL JOURNAL OF THE CARE OF THE INJURED, 2023, 54 (12)
  • [32] Effect of SARS-CoV-2 Infection on Incident Diabetes by Viral Variant: Findings From the National COVID Cohort Collaborative (N3C)
    Wong, Rachel
    Hall, Margaret A.
    Wiggen, Talia
    Johnson, Steven G.
    Huling, Jared D.
    Turner, Lindsey E.
    Wilkins, Kenneth J.
    Yeh, Hsin-Chieh
    Stuermer, Til
    Bramante, Carolyn T.
    Buse, John B.
    Reusch, Jane
    DIABETES CARE, 2024, 47 (10) : 1846 - 1854
  • [33] Effect of menopausal hormone therapy on COVID-19 severe outcomes in women-A population-based study of the US National COVID Cohort Collaborative (N3C) data
    Yoshida, Yilin
    Chu, San
    Zu, Yuanhao
    Fox, Sarah
    Mauvais-Jarvis, Franck
    MATURITAS, 2023, 170 : 39 - 41
  • [34] The 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task on clinical concept normalization for clinical records (vol 27, pg 1529, 2020)
    Henry, Sam
    Wang, Yanshan
    Shen, Feichen
    Uzuner, Ozlem
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (11) : 2546 - 2546
  • [35] Impact of Treatment of COVID-19 with Sotrovimab on Post-Acute COVID-19 Syndrome: An Analysis of National COVID Cohort Collaborative (N3C) Data
    Drysdale, Myriam
    Chang, Rose
    Guo, Tracy
    Gillespie, Iain A.
    Kalia, Sarah
    Duh, Mei Sheng
    Han, Jennifer
    Birch, Helen
    Sharpe, Catherine
    Liu, Daisy
    DerSarkissian, Maral
    Van Dyke, Melissa
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2024, 33 : 194 - 195
  • [36] Family History Extraction From Synthetic Clinical Narratives Using Natural Language Processing: Overview and Evaluation of a Challenge Data Set and Solutions for the 2019 National NLP Clinical Challenges (n2c2)/Open Health Natural Language Processing (OHNLP) Competition
    Shen, Feichen
    Liu, Sijia
    Fu, Sunyang
    Wang, Yanshan
    Henry, Sam
    Uzuner, Ozlem
    Liu, Hongfang
    JMIR MEDICAL INFORMATICS, 2021, 9 (01)
  • [37] Use of hydroxychloroquine, remdesivir, and dexamethasone among adults hospitalized with COVID-19 in the United States: Results from the National COVID Cohort Collaborative (N3C)
    Mehta, Hemalkumar B.
    An, Huijun
    Andersen, Kathleen M.
    Mansour, Omar
    Madhira, Vithal
    Rashidi, Emaan S.
    Bates, Benjamin
    Setoguchi, Soko
    Joseph, Corey
    Kocis, Paul
    Moffitt, Richard
    Bennett, Tellen D.
    Chute, Christopher
    Garibaldi, Brian T.
    Alexander, G. Caleb
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2021, 30 : 27 - 27
  • [38] Impact of treatment of COVID-19 with sotrovimab on post-acute sequelae of COVID-19 (PASC): an analysis of National COVID Cohort Collaborative (N3C) data
    Drysdale, Myriam
    Chang, Rose
    Guo, Tracy
    Duh, Mei Sheng
    Han, Jennifer
    Birch, Helen
    Sharpe, Catherine
    Liu, Daisy
    Kalia, Sarah
    Van Dyke, Melissa
    Dersarkissian, Maral
    Gillespie, Iain A.
    INFECTION, 2025
  • [39] Association between Dexmedetomidine Use and Mortality in Patients with COVID-19 Receiving Invasive Mechanical Ventilation: A US National COVID Cohort Collaborative (N3C) Study
    Hamilton, John L.
    Baccile, Rachel
    Best, Thomas J.
    Desai, Pankaja
    Landay, Alan
    Rojas, Juan C.
    Wimmer, Markus A.
    Balk, Robert A.
    JOURNAL OF CLINICAL MEDICINE, 2024, 13 (12)
  • [40] Drug-drug interaction between dexamethasone and direct-acting oral anticoagulants: a nested case-control study in the National COVID Cohort Collaborative (N3C)
    Kravchenko, Olga V.
    Boyce, Richard D.
    Gomez-Lumbreras, Ainhoa
    Kocis, Paul T.
    Zapata, Lorenzo Villa
    Tan, Malinda
    Leonard, Charles E.
    Andersen, Kathleen M.
    Mehta, Hemalkumar
    Alexander, G. Caleb
    Malone, Daniel C.
    BMJ OPEN, 2022, 12 (12)