Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data

Cited by: 15
Authors
Mandair, Divneet [1]
Tiwari, Premanand [2]
Simon, Steven [3]
Colborn, Kathryn L. [4]
Rosenberg, Michael A. [1, 3]
Affiliations
[1] Univ Colorado, Sch Med, Div Internal Med, Aurora, CO 80309 USA
[2] Univ Colorado, Sch Med, Colorado Ctr Personalized Med, Aurora, CO USA
[3] Univ Colorado, Sch Med, Div Cardiol & Cardiac Electrophysiol, 12631 E 17th Ave,Mail Stop B130, Aurora, CO 80045 USA
[4] Univ Colorado, Sch Med, Dept Surg, Aurora, CO USA
Keywords
Myocardial infarction; Machine learning; Electronic health records; CARDIOVASCULAR-DISEASE; MORTALITY; MODELS
DOI
10.1186/s12911-020-01268-x
Chinese Library Classification (CLC) number
R-058
Discipline classification code
Abstract
Background: With cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models to a baseline logistic regression using only 'known' risk factors in predicting incident myocardial infarction (MI) from harmonized EHR data.

Methods: Large-scale case-control study with an outcome of 6-month incident MI, performed on 2.27 million patients within the UCHealth system. Predictors were the top 800 of an initial 52k procedures, diagnoses, and medications, harmonized to the Observational Medical Outcomes Partnership common data model. We compared several over- and under-sampling techniques to address the imbalance in the dataset. We compared regularized logistic regression, random forest, gradient boosted machines, and shallow and deep neural networks. A baseline model for comparison was a logistic regression using a limited set of 'known' risk factors for MI. Hyper-parameters were identified using 10-fold cross-validation.

Results: A total of 20,591 patients were diagnosed with MI, compared with 2.25 million who were not. A deep neural network with random undersampling provided superior classification compared with other methods. However, the benefit of the deep neural network was only moderate, showing an F1 score of 0.092 and AUC of 0.835, compared to a logistic regression model using only 'known' risk factors. Calibration for all models was poor despite adequate discrimination, due to overfitting from the low frequency of the event of interest.

Conclusions: Our study suggests that deep neural networks (DNN) may not offer substantial benefit when trained on harmonized data, compared to traditional methods using established risk factors for MI.
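The abstract reports 20,591 MI cases among 2.27 million patients, a prevalence of roughly 0.9%, and names random undersampling and the F1 score without giving implementation details. The following is a minimal sketch of both ideas on toy data, not the authors' pipeline. The function names `random_undersample` and `f1_score`, the 1:1 case-to-control ratio, and the toy labels are all assumptions made for illustration.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices that keep every positive (label 1) and an
    equally sized random sample of negatives (label 0).

    The 1:1 ratio is an assumption; the abstract does not state the ratio used.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + kept_neg)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with 1% prevalence (hypothetical, not the study data).
labels = [1] * 10 + [0] * 990
idx = random_undersample(labels, seed=42)
balanced = [labels[i] for i in idx]
```

Because F1 depends on precision, it falls sharply when positives are rare in the evaluation data, whereas AUC is insensitive to prevalence. This is one plausible reading of how a model can pair an AUC of 0.835 with an F1 score of 0.092.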
Pages: 10