Scalable and accurate deep learning with electronic health records

Cited by: 1223
Authors
Rajkomar, Alvin [1 ,2 ]
Oren, Eyal [1 ]
Chen, Kai [1 ]
Dai, Andrew M. [1 ]
Hajaj, Nissan [1 ]
Hardt, Michaela [1 ]
Liu, Peter J. [1 ]
Liu, Xiaobing [1 ]
Marcus, Jake [1 ]
Sun, Mimi [1 ]
Sundberg, Patrik [1 ]
Yee, Hector [1 ]
Zhang, Kun [1 ]
Zhang, Yi [1 ]
Flores, Gerardo [1 ]
Duggan, Gavin E. [1 ]
Irvine, Jamie [1 ]
Quoc Le [1 ]
Litsch, Kurt [1 ]
Mossin, Alexander [1 ]
Tansuwan, Justin [1 ]
Wang, De [1 ]
Wexler, James [1 ]
Wilson, Jimbo [1 ]
Ludwig, Dana [2 ]
Volchenboum, Samuel L. [3 ]
Chou, Katherine [1 ]
Pearson, Michael [1 ]
Madabushi, Srinivasan [1 ]
Shah, Nigam H. [4 ]
Butte, Atul J. [2 ]
Howell, Michael D. [1 ]
Cui, Claire [1 ]
Corrado, Greg S. [1 ]
Dean, Jeffrey [1 ]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Univ Calif San Francisco, San Francisco, CA 94143 USA
[3] Univ Chicago Med, Chicago, IL USA
[4] Stanford Univ, Stanford, CA 94305 USA
Source
NPJ DIGITAL MEDICINE | 2018 / Vol. 1
Keywords
RISK PREDICTION MODELS; EARLY WARNING SCORE; BIG DATA; HOSPITAL READMISSION; MEDICAL-RECORDS; VALIDATION; CARE; INPATIENT; ANALYTICS; PATIENT;
DOI
10.1038/s41746-018-0029-1
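Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract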
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
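The abstract reports plain AUROC for the binary tasks and a frequency-weighted AUROC for discharge diagnoses. The sketch below illustrates both metrics on toy data. It assumes that "frequency-weighted" means averaging per-diagnosis AUROCs with weights proportional to each diagnosis's positive count; the record does not spell this out, and the function names and the labels `dx_a` and `dx_b` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def frequency_weighted_auroc(
    per_label: Dict[str, Tuple[List[int], List[float]]]
) -> float:
    """Average per-label AUROCs, weighting each label by its positive count.

    Assumption: this weighting scheme is one common reading of
    "frequency-weighted", not a confirmed detail of the paper.
    """
    total_weight = 0
    weighted_sum = 0.0
    for labels, scores in per_label.values():
        weight = sum(labels)  # number of positive examples for this label
        weighted_sum += weight * auroc(labels, scores)
        total_weight += weight
    return weighted_sum / total_weight


# Toy binary task: four patients, two of whom had the outcome.
binary_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Toy multi-label task with two hypothetical diagnoses.
toy = {
    "dx_a": ([1, 1, 1, 0], [0.9, 0.8, 0.3, 0.5]),  # common diagnosis
    "dx_b": ([1, 0, 0, 0], [0.9, 0.1, 0.2, 0.3]),  # rare diagnosis
}
weighted = frequency_weighted_auroc(toy)
```

Under this weighting, common diagnoses dominate the aggregate, so a single frequency-weighted figure can hide weak performance on rare diagnoses.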
Pages: 10
Related papers
50 in total
  • [1] Scalable and accurate deep learning with electronic health records
    Alvin Rajkomar
    Eyal Oren
    Kai Chen
    Andrew M. Dai
    Nissan Hajaj
    Michaela Hardt
    Peter J. Liu
    Xiaobing Liu
    Jake Marcus
    Mimi Sun
    Patrik Sundberg
    Hector Yee
    Kun Zhang
    Yi Zhang
    Gerardo Flores
    Gavin E. Duggan
    Jamie Irvine
    Quoc Le
    Kurt Litsch
    Alexander Mossin
    Justin Tansuwan
    De Wang
    James Wexler
    Jimbo Wilson
    Dana Ludwig
    Samuel L. Volchenboum
    Katherine Chou
    Michael Pearson
    Srinivasan Madabushi
    Nigam H. Shah
    Atul J. Butte
    Michael D. Howell
    Claire Cui
    Greg S. Corrado
    Jeffrey Dean
    npj Digital Medicine, 1
  • [2] Deep Learning for Electronic Health Records Analytics
    Harerimana, Gaspard
    Kim, Jong Wook
    Yoo, Hoon
    Jang, Beakcheol
    IEEE ACCESS, 2019, 7 : 101245 - 101259
  • [3] A Survey of Deep Learning for Electronic Health Records
    Xu, Jiabao
    Xi, Xuefeng
    Chen, Jie
    Sheng, Victor S.
    Ma, Jieming
    Cui, Zhiming
    APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [4] Deep Stable Representation Learning on Electronic Health Records
    Luo, Yingtao
    Liu, Zhaocheng
    Liu, Qiang
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022, : 1077 - 1082
  • [5] Predicting disease onset from electronic health records for population health management: a scalable and explainable Deep Learning approach
    Grout, Robert
    Gupta, Rishab
    Bryant, Ruby
    Elmahgoub, Mawada A.
    Li, Yijie
    Irfanullah, Khushbakht
    Patel, Rahul F.
    Fawkes, Jake
    Inness, Catherine
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 6
  • [6] Domain Knowledge Guided Deep Learning with Electronic Health Records
    Yin, Changchang
    Zhao, Rongjian
    Qian, Buyue
    Lv, Xin
    Zhang, Ping
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 738 - 747
  • [7] Readmission prediction using deep learning on electronic health records
    Ashfaq, Awais
    Sant'Anna, Anita
    Lingman, Markus
    Nowaczyk, Slawomir
    JOURNAL OF BIOMEDICAL INFORMATICS, 2019, 97
  • [8] Deep learning detects and visualizes bleeding events in electronic health records
    Pedersen, Jannik S.
    Laursen, Martin S.
    Savarimuthu, Thiusius Rajeeth
    Hansen, Rasmus Sogaard
    Alnor, Anne Bryde
    Bjerre, Kristian Voss
    Kjaer, Ina Mathilde
    Gils, Charlotte
    Thorsen, Anne-Sofie Faarvang
    Andersen, Eline Sandvig
    Nielsen, Cathrine Brodsgaard
    Andersen, Lou-Ann Christensen
    Andreas, Soren
    Vinholt, Pernille Just
    RESEARCH AND PRACTICE IN THROMBOSIS AND HAEMOSTASIS, 2021, 5 (04)
  • [9] A Novel Deep Similarity Learning Approach to Electronic Health Records Data
    Gupta, Vagisha
    Sachdeva, Shelly
    Bhalla, Subhash
    IEEE ACCESS, 2020, 8 : 209278 - 209295
  • [10] Deep learning for electronic health records: A comparative review of multiple deep neural architectures
    Solares, Jose Roberto Ayala
    Raimondi, Francesca Elisa Diletta
    Zhu, Yajie
    Rahimian, Fatemeh
    Canoy, Dexter
    Tran, Jenny
    Gomes, Ana Catarina Pinho
    Payberah, Amir H.
    Zottoli, Mariagrazia
    Nazarzadeh, Milad
    Conrad, Nathalie
    Rahimi, Kazem
    Salimi-Khorshidi, Gholamreza
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 101