Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data

Cited by: 15
Authors
Mandair, Divneet [1]
Tiwari, Premanand [2]
Simon, Steven [3]
Colborn, Kathryn L. [4]
Rosenberg, Michael A. [1, 3]
Affiliations
[1] Univ Colorado, Sch Med, Div Internal Med, Aurora, CO 80309 USA
[2] Univ Colorado, Sch Med, Colorado Ctr Personalized Med, Aurora, CO USA
[3] Univ Colorado, Sch Med, Div Cardiol & Cardiac Electrophysiol, 12631 E 17th Ave,Mail Stop B130, Aurora, CO 80045 USA
[4] Univ Colorado, Sch Med, Dept Surg, Aurora, CO USA
Keywords
Myocardial infarction; Machine learning; Electronic health records; Cardiovascular disease; Mortality; Models
DOI
10.1186/s12911-020-01268-x
Chinese Library Classification (CLC) number
R-058
Abstract
Background: With cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models to a baseline logistic regression using only 'known' risk factors in predicting incident myocardial infarction (MI) from harmonized EHR data.

Methods: We conducted a large-scale case-control study of 2.27 million patients within the UCHealth system, with 6-month incident MI as the outcome. Predictors were the top 800 of an initial 52,000 procedures, diagnoses, and medications, harmonized to the Observational Medical Outcomes Partnership (OMOP) common data model. We compared several over- and under-sampling techniques to address the imbalance in the dataset. We compared regularized logistic regression, random forest, boosted gradient machines, and shallow and deep neural networks. The baseline model for comparison was a logistic regression using a limited set of 'known' risk factors for MI. Hyper-parameters were identified using 10-fold cross-validation.

Results: A total of 20,591 patients were diagnosed with MI, compared with 2.25 million who were not. A deep neural network (DNN) with random undersampling provided superior classification compared with other methods. However, the benefit of the DNN was only moderate, showing an F1 score of 0.092 and an AUC of 0.835, compared to a logistic regression model using only 'known' risk factors. Calibration was poor for all models despite adequate discrimination, owing to overfitting driven by the low frequency of the event of interest.

Conclusions: Our study suggests that a DNN may not offer substantial benefit when trained on harmonized data, compared to traditional methods using established risk factors for MI.
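
The abstract names random undersampling, the F1 score, and AUC without showing how they fit together. The following is a minimal sketch of those three ideas on synthetic data, not the authors' pipeline. The function names (`random_undersample`, `f1_score`, `auc`), the 1% event rate, the Gaussian toy scores, and the 0.75 decision threshold are all assumptions made for illustration.

```python
import random


def random_undersample(labels, seed=0):
    """Keep every positive and an equal-sized random sample of negatives.

    Returns the sorted indices of the retained rows (hypothetical helper).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, k=min(len(pos), len(neg)))
    return sorted(pos + kept_neg)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Quadratic in sample size, so toy data only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic, heavily imbalanced data: 10 events among 1,000 records (assumed).
rng = random.Random(42)
labels = [1] * 10 + [0] * 990
scores = [rng.gauss(1.5, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

# Balanced training subset: all 10 positives plus 10 sampled negatives.
keep = random_undersample(labels, seed=1)
balanced = [labels[i] for i in keep]
print(len(keep), sum(balanced))  # prints: 20 10

# Evaluate on the full, imbalanced data with an arbitrary threshold (assumed).
preds = [1 if s > 0.75 else 0 for s in scores]
print("AUC:", round(auc(labels, scores), 3))
print("F1:", round(f1_score(labels, preds), 3))
```

AUC depends only on how positives rank against negatives, so it is insensitive to the event rate. F1 depends on precision, which falls sharply when negatives vastly outnumber positives. This is one way a model can pair reasonable discrimination with a low F1 score on rare outcomes.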
Pages: 10
Related papers
50 items in total
  • [1] Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data
    Divneet Mandair
    Premanand Tiwari
    Steven Simon
    Kathryn L. Colborn
    Michael A. Rosenberg
    [J]. BMC Medical Informatics and Decision Making, 20
  • [2] DEVELOPMENT OF A PREDICTION MODEL FOR INCIDENT MYOCARDIAL INFARCTION USING MACHINE LEARNING APPLIED TO HARMONIZED ELECTRONIC HEALTH RECORD DATA
    Mandair, Divneet
    Tiwari, Premanand
    Simon, Steven
    Rosenberg, Michael
    [J]. JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2020, 75 (11) : 194 - 194
  • [3] Assessment of a Machine Learning Model Applied to Harmonized Electronic Health Record Data for the Prediction of Incident Atrial Fibrillation
    Tiwari, Premanand
    Colborn, Kathryn L.
    Smith, Derek E.
    Xing, Fuyong
    Ghosh, Debashis
    Rosenberg, Michael A.
    [J]. JAMA NETWORK OPEN, 2020, 3 (01)
  • [4] Prediction of Drug-Induced Long QT Syndrome Using Machine Learning Applied to Harmonized Electronic Health Record Data
    Simon, Steven T.
    Mandair, Divneet
    Tiwari, Premanand
    Rosenberg, Michael A.
    [J]. JOURNAL OF CARDIOVASCULAR PHARMACOLOGY AND THERAPEUTICS, 2021, 26 (04) : 335 - 340
  • [5] Development of a Hypoglycemia Prediction Model for Veterans With Diabetes Using Supervised Machine Learning Applied to Electronic Health Record Data
    Raghavan, Sridharan
    Liu, Wenhui
    Baron, Anna
    Saxon, David
    Plomondon, Meg
    Ho, Michael
    Caplan, Liron
    [J]. CIRCULATION, 2020, 141
  • [6] Preoperative Prediction of Postoperative Infections Using Machine Learning and Electronic Health Record Data
    Zhuang, Yaxu
    Dyas, Adam
    Meguid, Robert A.
    Henderson, William G.
    Bronsert, Michael
    Madsen, Helen
    Colborn, Kathryn L.
    [J]. ANNALS OF SURGERY, 2024, 279 (04) : 720 - 726
  • [7] Postoperative delirium prediction using machine learning models and preoperative electronic health record data
    Andrew Bishara
    Catherine Chiu
    Elizabeth L. Whitlock
    Vanja C. Douglas
    Sei Lee
    Atul J. Butte
    Jacqueline M. Leung
    Anne L. Donovan
    [J]. BMC Anesthesiology, 22
  • [8] Prediction of Atherosclerotic Cardiovascular Disease Risk Using Machine Learning and Electronic Health Record Data
    Ward, Andrew
    Sarraju, Ashish
    Chung, Sukyung
    Palaniappan, Latha
    Scheinker, David
    Rodriguez, Fatima
    [J]. CIRCULATION, 2019, 140
  • [9] Postoperative delirium prediction using machine learning models and preoperative electronic health record data
    Bishara, Andrew
    Chiu, Catherine
    Whitlock, Elizabeth L.
    Douglas, Vanja C.
    Lee, Sei
    Butte, Atul J.
    Leung, Jacqueline M.
    Donovan, Anne L.
    [J]. BMC ANESTHESIOLOGY, 2022, 22 (01)
  • [10] Machine Learning Algorithm For Improving Classification Of Myocardial Infarction Across A Diverse Population Using US Veteran Electronic Health Record Data
    Cho, Kelly
    Link, Nicholas
    Schubert, Petra
    He, Zeling
    Honerlaw, Jacqueline P.
    Cai, Tianrun
    Orkaby, Ariela R.
    Qazi, Saadia
    Tanukonda, Vidisha
    Sun, Jiehuan
    Dahal, Kumar
    Galloway, Ashley
    Costa, Lauren
    Zhang, Yichi
    Gagnon, David R.
    Hong, Chuan
    Ho, Yuk-Lam
    Gaziano, J. M.
    Wilson, Peter W.
    Cai, Tianxi
    Liao, Katherine P.
    [J]. CIRCULATION, 2021, 143