Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data

Cited by: 15
Authors
Mandair, Divneet [1]
Tiwari, Premanand [2]
Simon, Steven [3]
Colborn, Kathryn L. [4]
Rosenberg, Michael A. [1,3]
Affiliations
[1] Univ Colorado, Sch Med, Div Internal Med, Aurora, CO 80309 USA
[2] Univ Colorado, Sch Med, Colorado Ctr Personalized Med, Aurora, CO USA
[3] Univ Colorado, Sch Med, Div Cardiol & Cardiac Electrophysiol, 12631 E 17th Ave,Mail Stop B130, Aurora, CO 80045 USA
[4] Univ Colorado, Sch Med, Dept Surg, Aurora, CO USA
Keywords
Myocardial infarction; Machine learning; Electronic health records; CARDIOVASCULAR-DISEASE; MORTALITY; MODELS
DOI
10.1186/s12911-020-01268-x
CLC number
R-058
Abstract
Background: With cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models to a baseline logistic regression using only 'known' risk factors in predicting incident myocardial infarction (MI) from harmonized EHR data.
Methods: We conducted a large-scale case-control study of 2.27 million patients, with 6-month incident MI as the outcome, using the top 800 of an initial 52,000 procedures, diagnoses, and medications within the UCHealth system, harmonized to the Observational Medical Outcomes Partnership common data model. We compared several over- and under-sampling techniques to address the class imbalance in the dataset. The models compared were regularized logistic regression, random forest, boosted gradient machines, and shallow and deep neural networks. The baseline model for comparison was a logistic regression using a limited set of 'known' risk factors for MI. Hyper-parameters were identified using 10-fold cross-validation.
Results: A total of 20,591 patients were diagnosed with MI, compared with 2.25 million who were not. A deep neural network with random undersampling provided superior classification compared with other methods. However, its benefit over a logistic regression model using only 'known' risk factors was only moderate; the deep neural network showed an F1 score of 0.092 and an AUC of 0.835. Calibration for all models was poor despite adequate discrimination, due to overfitting from the low frequency of the event of interest.
Conclusions: Our study suggests that deep neural networks (DNN) may not offer substantial benefit when trained on harmonized data, compared to traditional methods using established risk factors for MI.
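The abstract names random undersampling and the F1 score without showing either. The following is a minimal sketch in plain Python, not the authors' pipeline: the function names, the toy data, and the confusion counts are all hypothetical and chosen only for illustration. It shows how undersampling balances a rare-event training set, and why F1 can stay low on an imbalanced test set even when most true events are caught.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Keep every positive row plus an equally sized random sample of negatives."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep = pos + rng.sample(neg, min(len(pos), len(neg)))
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 10 events among 1,000 patients, i.e. about 1% prevalence.
labels = [1] * 10 + [0] * 990
rows = list(range(1000))

bal_rows, bal_labels = random_undersample(rows, labels)
print(sum(bal_labels), len(bal_labels))  # 10 positives out of 20 rows

# Hypothetical confusion counts on the imbalanced test set: 8 of 10 events
# caught (recall 0.8), but 100 non-events flagged as well (precision ~0.07).
print(round(f1_score(tp=8, fp=100, fn=2), 3))  # 0.136
```

With rare events, false positives drawn from the large negative class dominate precision, so a low F1 can coexist with a reasonably high AUC, as in the figures reported in the abstract.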
Pages: 10