Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data

Cited by: 33
Authors
Rasmy, Laila [1]
Nigo, Masayuki [2]
Kannadath, Bijun Sai [4]
Xie, Ziqian [1]
Mao, Bingyu [1]
Patel, Khush [1]
Zhou, Yujia [1]
Zhang, Wanheng [3]
Ross, Angela [1]
Xu, Hua [1]
Zhi, Degui [1]
Affiliations
[1] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX 77030 USA
[2] Univ Texas Hlth Sci Ctr Houston, McGovern Med Sch, Houston, TX 77030 USA
[3] Univ Texas Hlth Sci Ctr Houston, Sch Publ Hlth, Houston, TX USA
[4] Univ Arizona, Coll Med, Phoenix, AZ USA
Source
LANCET DIGITAL HEALTH | 2022 / Vol. 4 / Issue 06
Keywords
DOI
10.1016/S2589-7500(22)00049-8
Chinese Library Classification number
R-058
Subject classification code
Abstract
Background: Predicting outcomes of patients with COVID-19 at an early stage is crucial for optimised clinical care and resource management, especially during a pandemic. Although multiple machine learning models have been proposed to address this issue, because of their requirements for extensive data preprocessing and feature engineering, they have not been validated or implemented outside of their original study site. Therefore, we aimed to develop accurate and transferrable predictive models of outcomes on hospital admission for patients with COVID-19.
Methods: In this study, we developed recurrent neural network-based models (CovRNN) to predict the outcomes of patients with COVID-19 by use of available electronic health record data on admission to hospital, without the need for specific feature selection or missing data imputation. CovRNN was designed to predict three outcomes: in-hospital mortality, need for mechanical ventilation, and prolonged hospital stay (>7 days). For in-hospital mortality and mechanical ventilation, CovRNN produced time-to-event risk scores (survival prediction; evaluated by the concordance index) and all-time risk scores (binary prediction; area under the receiver operating characteristic curve [AUROC] was the main metric); we only trained a binary classification model for prolonged hospital stay. For binary classification tasks, we compared CovRNN against traditional machine learning algorithms: logistic regression and light gradient boost machine. Our models were trained and validated on the heterogeneous, deidentified data of 247 960 patients with COVID-19 from 87 US health-care systems derived from the Cerner Real-World COVID-19 Q3 Dataset up to September 2020. We held out the data of 4175 patients from two hospitals for external validation. The remaining 243 785 patients from the 85 health systems were grouped into training (n=170 626), validation (n=24 378), and multi-hospital test (n=48 781) sets. Model performance was evaluated in the multi-hospital test set. The transferability of CovRNN was externally validated by use of deidentified data from 36 140 patients derived from the US-based Optum deidentified COVID-19 electronic health record dataset (version 1015; from January, 2007, to Oct 15, 2020). Exact dates of data extraction were masked by the databases to ensure patient data safety.
Findings: CovRNN binary models achieved AUROCs of 93.0% (95% CI 92.6-93.4) for the prediction of in-hospital mortality, 92.9% (92.6-93.2) for the prediction of mechanical ventilation, and 86.5% (86.2-86.9) for the prediction of a prolonged hospital stay, outperforming light gradient boost machine and logistic regression algorithms. External validation confirmed AUROCs in similar ranges (91.3-97.0% for in-hospital mortality prediction, 91.5-96.0% for the prediction of mechanical ventilation, and 81.0-88.3% for the prediction of prolonged hospital stay). For survival prediction, CovRNN achieved a concordance index of 86.0% (95% CI 85.1-86.9) for in-hospital mortality and 92.6% (92.2-93.0) for mechanical ventilation.
Interpretation: Trained on a large, heterogeneous, real-world dataset, our CovRNN models showed high prediction accuracy and transferability through consistently good performances on multiple external datasets. Our results show the feasibility of a COVID-19 predictive model that delivers high accuracy without the need for complex feature engineering.
Copyright (C) 2022 The Author(s). Published by Elsevier Ltd.
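For readers unfamiliar with the two metrics named in the abstract, the following is a minimal pure-Python sketch of how AUROC (for the binary risk scores) and Harrell's concordance index (for the time-to-event risk scores) can be computed. It is not the authors' evaluation code: the function names `auroc` and `concordance_index` and all toy values are hypothetical, and pairs with tied event times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def auroc(labels, scores):
    """Chance that a random positive case outscores a random negative case.

    Tied scores count as half a win. Illustrative helper, not the paper's code.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the subject with the shorter time had an
    observed event (events[i] is truthy). Assumption: pairs with tied
    times are skipped rather than handled specially.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier subject was censored
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.4, 0.6]
toy_times = [2, 5, 3, 8]      # days to event or censoring
toy_events = [1, 0, 1, 1]     # 1 = event observed, 0 = censored
toy_risks = [0.9, 0.3, 0.2, 0.1]

print(auroc(toy_labels, toy_scores))
print(concordance_index(toy_times, toy_events, toy_risks))
```

Both metrics are rank-based: they depend only on the ordering of the risk scores, not on their calibration, which is why they can be reported for models trained with different objectives.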
Pages: E415-E425
Page count: 11
Related papers
50 in total
  • [41] Visualization of Covid-19 pandemic influence on healthcare routines in dermatology using electronic health record data
    Wolf, J. Ryan
    Zhang, L.
    Xie, Y.
    Pentland, A.
    Pentland, B. T.
    JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2023, 143 (05) : S119 - S119
  • [42] Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data
    Klann, Jeffrey G.
    Estiri, Hossein
    Weber, Griffin M.
    Moal, Bertrand
    Avillach, Paul
    Hong, Chuan
    Tan, Amelia L. M.
    Beaulieu-Jones, Brett K.
    Castro, Victor
    Maulhardt, Thomas
    Geva, Alon
    Malovini, Alberto
    South, Andrew M.
    Visweswaran, Shyam
    Morris, Michele
    Samayamuthu, Malarkodi J.
    Omenn, Gilbert S.
    Ngiam, Kee Yuan
    Mandl, Kenneth D.
    Boeker, Martin
    Olson, Karen L.
    Mowery, Danielle L.
    Follett, Robert W.
    Hanauer, David A.
    Bellazzi, Riccardo
    Moore, Jason H.
    Loh, Ne-Hooi Will
    Bell, Douglas S.
    Wagholikar, Kavishwar B.
    Chiovato, Luca
    Tibollo, Valentina
    Rieg, Siegbert
    Li, Anthony L. L. J.
    Jouhet, Vianney
    Schriver, Emily
    Xia, Zongqi
    Hutch, Meghan
    Luo, Yuan
    Kohane, Isaac S.
    Brat, Gabriel A.
    Murphy, Shawn N.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (07) : 1411 - 1420
  • [43] Using Multi-Modal Electronic Health Record Data for the Development and Validation of Risk Prediction Models for Long COVID Using the Super Learner Algorithm
    Jin, Weijia
    Hao, Wei
    Shi, Xu
    Fritsche, Lars G.
    Salvatore, Maxwell
    Admon, Andrew J.
    Friese, Christopher R.
    Mukherjee, Bhramar
    JOURNAL OF CLINICAL MEDICINE, 2023, 12 (23)
  • [44] Clinical Prediction Models for Hospital-Induced Delirium Using Structured and Unstructured Electronic Health Record Data: Protocol for a Development and Validation Study
    Ser, Sarah E.
    Shear, Kristen
    Snigurska, Urszula A.
    Prosperi, Mattia
    Wu, Yonghui
    Magoc, Tanja
    Bjarnadottir, Ragnhildur I.
    Lucero, Robert J.
    JMIR RESEARCH PROTOCOLS, 2023, 12
  • [45] Development and Validation of Machine Learning Models to Predict Bacteremia and Fungemia Using Electronic Health Record (EHR) Data
    Bhavani, S.
    Lonjers, Z.
    Carey, K.
    Gilbert, E. R.
    Afshar, M.
    Shah, N.
    Huang, E.
    Churpek, M. M.
    AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2020, 201
  • [46] Predicting 30-Day Pneumonia Readmissions Using Electronic Health Record Data From The Full Hospital Stay: Model Development And Comparison
    Makam, A. N.
    Nguyen, O. K.
    Zhang, S.
    Xie, B.
    Weinreich, M. A.
    Amarasingham, R.
    Mortensen, E. M.
    Halm, E. A.
    AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2016, 193
  • [47] PREDICTING 30-DAY PNEUMONIA READMISSIONS USING ELECTRONIC HEALTH RECORD DATA FROM THE FULL HOSPITAL STAY: MODEL DEVELOPMENT AND COMPARISON
    Nguyen, Oanh K.
    Makam, Anil N.
    Zhang, Song
    Xie, Bin
    Weinreich, Mark A.
    Amarasingham, Ruben
    Mortensen, Eric
    Halm, Ethan
    JOURNAL OF GENERAL INTERNAL MEDICINE, 2016, 31 : S351 - S351
  • [48] Prediction of intensive care admission and hospital mortality in COVID-19 patients using demographics and baseline laboratory data
    Avelino-Silva, Vivian I.
    Avelino-Silva, Thiago J.
    Aliberti, Marlon J. R.
    Ferreira, Juliana C.
    Cobello Junior, Vilson
    Silva, Katia R.
    Pompeu, Jose E.
    Antonangelo, Leila
    Magri, Marcello M.
    Barros Filho, Tarcisio E. P.
    Souza, Heraldo P.
    Kallas, Esper G.
    CLINICS, 2023, 78
  • [49] Development and Validation of Algorithms to Identify COVID-19 Patients Using a US Electronic Health Records Database: A Retrospective Cohort Study
    Brown, Carolyn A.
    Londhe, Ajit A.
    He, Fang
    Cheng, Alvan
    Ma, Junjie
    Zhang, Jie
    Brooks, Corinne G.
    Sprafka, J. Michael
    Roehl, Kimberly A.
    Carlson, Katherine B.
    Page, John H.
    CLINICAL EPIDEMIOLOGY, 2022, 14 : 699 - 709
  • [50] Detection of Bleeding Events in Electronic Health Record Notes Using Convolutional Neural Network Models Enhanced With Recurrent Neural Network Autoencoders: Deep Learning Approach
    Li, Rumeng
    Hu, Baotian
    Liu, Feifan
    Liu, Weisong
    Cunningham, Francesca
    McManus, David D.
    Yu, Hong
    JMIR MEDICAL INFORMATICS, 2019, 7 (01)