An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)

Cited by: 4
Authors
Liu, Sijia [1]
Wen, Andrew [1]
Wang, Liwei [1]
He, Huan [1]
Fu, Sunyang [1]
Miller, Robert [2]
Williams, Andrew [2]
Harris, Daniel [3]
Kavuluru, Ramakanth [3]
Liu, Mei [4]
Abu-el-Rub, Noor [4]
Schutte, Dalton [5]
Zhang, Rui [5]
Rouhizadeh, Masoud [6]
Osborne, John D. [7]
He, Yongqun [8]
Topaloglu, Umit [9]
Hong, Stephanie S. [10]
Saltz, Joel H. [11]
Schaffter, Thomas [12]
Pfaff, Emily [13]
Chute, Christopher G. [10]
Duong, Tim [14]
Haendel, Melissa A. [15]
Fuentes, Rafael [16]
Szolovits, Peter [17]
Xu, Hua [18]
Liu, Hongfang [1, 18]
Affiliations
[1] Mayo Clin, Dept Artificial Intelligence & Informat, Rochester, MN USA
[2] Tufts Med Ctr, Tufts Clin & Translat Sci Inst, Boston, MA USA
[3] Univ Kentucky, Dept Internal Med, Lexington, KY USA
[4] Univ Kansas, Dept Internal Med, Med Ctr, Kansas City, KS USA
[5] Univ Minnesota Twin Cities, Dept Pharmaceut Care Hlth Syst, Minneapolis, MN USA
[6] Univ Florida, Dept Pharmaceut Outcomes & Policy, Gainesville, FL USA
[7] Univ Alabama Birmingham, Dept Comp Sci, Birmingham, AL USA
[8] Univ Michigan, Dept Comp Med & Bioinformat, Med Sch, Ann Arbor, MI USA
[9] Wake Forest Sch Med, Dept Canc Biol, Winston Salem, NC USA
[10] Johns Hopkins Univ, Dept Med, Baltimore, MD USA
[11] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY USA
[12] Sage Bionetworks, Seattle, WA USA
[13] Univ North Carolina Chapel Hill, Dept Med, Chapel Hill, NC USA
[14] Albert Einstein Coll Med, Dept Radiol, Bronx, NY USA
[15] Univ Colorado, Ctr Hlth AI, Anschutz Med Campus, Denver, CO USA
[16] Alex Informat, North Bethesda, MD USA
[17] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA USA
[18] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX USA
Funding
National Institutes of Health (NIH)
Keywords
electronic health records; natural language processing; federated learning; multi-institutional data annotation
DOI
10.1093/jamia/ocad134
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, and the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Pages: 2036-2040
Page count: 5