Scalable and accurate deep learning with electronic health records

Cited by: 1223
Authors
Rajkomar, Alvin [1 ,2 ]
Oren, Eyal [1 ]
Chen, Kai [1 ]
Dai, Andrew M. [1 ]
Hajaj, Nissan [1 ]
Hardt, Michaela [1 ]
Liu, Peter J. [1 ]
Liu, Xiaobing [1 ]
Marcus, Jake [1 ]
Sun, Mimi [1 ]
Sundberg, Patrik [1 ]
Yee, Hector [1 ]
Zhang, Kun [1 ]
Zhang, Yi [1 ]
Flores, Gerardo [1 ]
Duggan, Gavin E. [1 ]
Irvine, Jamie [1 ]
Le, Quoc [1 ]
Litsch, Kurt [1 ]
Mossin, Alexander [1 ]
Tansuwan, Justin [1 ]
Wang, De [1 ]
Wexler, James [1 ]
Wilson, Jimbo [1 ]
Ludwig, Dana [2 ]
Volchenboum, Samuel L. [3 ]
Chou, Katherine [1 ]
Pearson, Michael [1 ]
Madabushi, Srinivasan [1 ]
Shah, Nigam H. [4 ]
Butte, Atul J. [2 ]
Howell, Michael D. [1 ]
Cui, Claire [1 ]
Corrado, Greg S. [1 ]
Dean, Jeffrey [1 ]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Univ Calif San Francisco, San Francisco, CA 94143 USA
[3] Univ Chicago Med, Chicago, IL USA
[4] Stanford Univ, Stanford, CA 94305 USA
Source
NPJ DIGITAL MEDICINE | 2018 / Volume 1
Keywords
RISK PREDICTION MODELS; EARLY WARNING SCORE; BIG DATA; HOSPITAL READMISSION; MEDICAL-RECORDS; VALIDATION; CARE; INPATIENT; ANALYTICS; PATIENT;
DOI
10.1038/s41746-018-0029-1
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operating characteristic curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
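The abstract reports results as AUROC and, for discharge diagnoses, as a frequency-weighted AUROC. As a rough illustration of what these metrics measure, the sketch below computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, then averages per-label values. The weighting scheme (each label weighted by its number of positive cases) is an assumption, since this record does not define it; the function names and the toy data are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def frequency_weighted_auroc(
    per_label: Dict[str, Tuple[List[int], List[float]]]
) -> float:
    """Average per-label AUROCs, weighting each label by its positive count.

    The weighting by positive count is an assumption made for this sketch.
    """
    total_weight = 0
    weighted_sum = 0.0
    for labels, scores in per_label.values():
        weight = sum(labels)  # number of positive cases for this label
        weighted_sum += weight * auroc(labels, scores)
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical toy data: two diagnosis labels, four patients each.
toy = {
    "diagnosis_a": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "diagnosis_b": ([1, 0, 0, 0], [0.8, 0.3, 0.2, 0.1]),
}
print(frequency_weighted_auroc(toy))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; at the scale described in the abstract a rank-based computation would be the more practical choice.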
Pages: 10
Related papers
50 in total
  • [31] Combining deep learning with token selection for patient phenotyping from electronic health records
    Yang, Zhen
    Dehmer, Matthias
    Yli-Harja, Olli
    Emmert-Streib, Frank
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [32] Deep Learning with Heterogeneous Graph Embeddings for Mortality Prediction from Electronic Health Records
    Wanyan, Tingyi
    Honarvar, Hossein
    Azad, Ariful
    Ding, Ying
    Glicksberg, Benjamin S.
    DATA INTELLIGENCE, 2021, 3 (03) : 329 - 339
  • [33] Boosting Deep Learning Risk Prediction with Generative Adversarial Networks for Electronic Health Records
    Che, Zhengping
    Cheng, Yu
    Zha, Shuangfei
    Sun, Zhaonan
    Liu, Yan
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 787 - 792
  • [34] DDxNet: a deep learning model for automatic interpretation of electronic health records, electrocardiograms and electroencephalograms
    Thiagarajan, Jayaraman J.
    Rajan, Deepta
    Katoch, Sameeksha
    Spanias, Andreas
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [35] Deep representation learning for clustering longitudinal survival data from electronic health records
    Jiajun Qiu
    Yao Hu
    Li Li
    Abdullah Mesut Erzurumluoglu
    Ingrid Braenne
    Charles Whitehurst
    Jochen Schmitz
    Jatin Arora
    Boris Alexander Bartholdy
    Shrey Gandhi
    Pierre Khoueiry
    Stefanie Mueller
    Boris Noyvert
    Zhihao Ding
    Jan Nygaard Jensen
    Johann de Jong
    Nature Communications, 16 (1)
  • [36] Deep representation learning for individualized treatment effect estimation using electronic health records
    Chen, Peipei
    Dong, Wei
    Lu, Xudong
    Kaymak, Uzay
    He, Kunlun
    Huang, Zhengxing
    JOURNAL OF BIOMEDICAL INFORMATICS, 2019, 100
  • [37] Research progress on electronic health records multimodal data fusion based on deep learning
    Fan, Yong
    Zhang, Zhengbo
    Wang, Jing
    Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2024, 41 (05) : 1062 - 1071
  • [38] Combining deep learning with token selection for patient phenotyping from electronic health records
    Zhen Yang
    Matthias Dehmer
    Olli Yli-Harja
    Frank Emmert-Streib
    Scientific Reports, 10
  • [39] Using deep learning and electronic health records to detect Noonan syndrome in pediatric patients
    Yang, Zeyu
    Shikany, Amy
    Ni, Yizhao
    Zhang, Ge
    Weaver, K. Nicole
    Chen, Jing
    GENETICS IN MEDICINE, 2022, 24 (11) : 2329 - 2337
  • [40] A deep learning approach for transgender and gender diverse patient identification in electronic health records
    Hua, Yining
    Wang, Liqin
    Nguyen, Vi
    Rieu-Werden, Meghan
    McDowell, Alex
    Bates, David W.
    Foer, Dinah
    Zhou, Li
    JOURNAL OF BIOMEDICAL INFORMATICS, 2023, 147