Data-driven analysis to understand long COVID using electronic health records from the RECOVER initiative

Cited by: 16
Authors
Zang, Chengxi [1]
Zhang, Yongkang [1]
Xu, Jie [2]
Bian, Jiang [2]
Morozyuk, Dmitry [1]
Schenck, Edward J. [3]
Khullar, Dhruv [1]
Nordvig, Anna S. [4]
Shenkman, Elizabeth A. [2]
Rothman, Russell L. [5]
Block, Jason P. [6]
Lyman, Kristin [7]
Weiner, Mark G. [1]
Carton, Thomas W. [7]
Wang, Fei [1]
Kaushal, Rainu [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Univ Florida, Dept Hlth Outcomes Biomed Informat, Gainesville, FL USA
[3] Weill Cornell Med, Dept Med, Div Pulm & Crit Care Med, New York, NY USA
[4] Weill Cornell Med, Dept Neurol, New York, NY USA
[5] Vanderbilt Univ, Ctr Hlth Serv Res, Med Ctr, Nashville, TN USA
[6] Harvard Med Sch, Harvard Pilgrim Hlth Care Inst, Dept Populat Med, Boston, MA USA
[7] Louisiana Publ Hlth Inst, New Orleans, LA USA
Funding
National Institutes of Health (NIH)
DOI
10.1038/s41467-023-37653-z
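Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract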
In this study, the authors characterise post-acute sequelae of SARS-CoV-2 (PASC) in two large cohorts based on electronic health records from the USA. They identify a broad range of PASC-related conditions that were only partially replicated across the two cohorts, indicating possible heterogeneity between populations.

Recent studies have investigated post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) using real-world patient data such as electronic health records (EHR). Prior studies have typically been conducted on cohorts drawn from specific patient populations, which makes their generalizability unclear. This study aims to characterize PASC using the EHR data warehouses from two large Patient-Centered Clinical Research Networks (PCORnet), INSIGHT and OneFlorida+, which include 11 million patients in the New York City (NYC) area and 16.8 million patients in Florida, respectively. With a high-throughput screening pipeline based on propensity scores and inverse probability of treatment weighting, we identified a broad list of diagnoses and medications that exhibited significantly higher incidence risk in patients 30-180 days after laboratory-confirmed SARS-CoV-2 infection than in non-infected patients. We identified more PASC diagnoses in NYC than in Florida under our screening criteria, and conditions including dementia, hair loss, pressure ulcers, pulmonary fibrosis, dyspnea, pulmonary embolism, chest pain, abnormal heartbeat, malaise, and fatigue were replicated across both cohorts. Our analyses highlight potentially heterogeneous risks of PASC in different populations.
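For readers unfamiliar with the weighting technique named in the abstract, the following is a minimal sketch of propensity scores combined with inverse probability of treatment weighting (IPTW). It is not the authors' pipeline. Everything in it is an assumption made for illustration: the data are simulated, there is a single binary confounder, the helper names (fit_propensity, iptw_weights, weighted_risk, simulate, risk_differences) are hypothetical, and the effect is summarised as a simple risk difference for one outcome rather than screened across many diagnoses and medications.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(x, t, lr=1.0, epochs=300):
    """Fit P(t=1 | x) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, ti in zip(x, t):
            err = sigmoid(b0 + b1 * xi) - ti
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def iptw_weights(ps, t, lo=0.01, hi=0.99):
    """Weight exposed units by 1/p and unexposed units by 1/(1-p).

    Propensity scores are clipped to [lo, hi] to avoid extreme weights.
    """
    weights = []
    for p, ti in zip(ps, t):
        p = min(max(p, lo), hi)
        weights.append(1.0 / p if ti == 1 else 1.0 / (1.0 - p))
    return weights


def weighted_risk(y, t, w, arm):
    """Weighted proportion of outcome events within one exposure arm."""
    num = sum(wi * yi for yi, ti, wi in zip(y, t, w) if ti == arm)
    den = sum(wi for ti, wi in zip(t, w) if ti == arm)
    return num / den


def simulate(n=5000, seed=0):
    """Simulated cohort (assumption): x confounds both exposure and outcome.

    By construction the true risk difference due to exposure is 0.1.
    """
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        xi = 1 if rng.random() < 0.5 else 0
        ti = 1 if rng.random() < 0.2 + 0.4 * xi else 0
        yi = 1 if rng.random() < 0.1 + 0.2 * xi + 0.1 * ti else 0
        x.append(xi)
        t.append(ti)
        y.append(yi)
    return x, t, y


def risk_differences(x, t, y):
    """Return (unweighted, IPTW-weighted) risk differences."""
    b0, b1 = fit_propensity(x, t)
    ps = [sigmoid(b0 + b1 * xi) for xi in x]
    w = iptw_weights(ps, t)
    ones = [1.0] * len(t)
    naive = weighted_risk(y, t, ones, 1) - weighted_risk(y, t, ones, 0)
    adjusted = weighted_risk(y, t, w, 1) - weighted_risk(y, t, w, 0)
    return naive, adjusted


x, t, y = simulate()
naive_rd, adjusted_rd = risk_differences(x, t, y)
print(f"naive risk difference: {naive_rd:.3f}")
print(f"IPTW risk difference:  {adjusted_rd:.3f}")
```

In this simulation the exposed group carries the confounder more often, so the unweighted comparison overstates the effect. Reweighting by the inverse of the estimated propensity score balances the confounder between the two arms and moves the estimate toward the simulated true value of 0.1. A real EHR analysis would involve many covariates, balance diagnostics, and time-to-event outcomes, none of which are shown here.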
Pages: 14
Source: Nature Communications, 14