Improved Cardiovascular Risk Prediction Using Nonparametric Regression and Electronic Health Record Data

Cited by: 38
Authors
Kennedy, Edward H. [1]
Wiitala, Wyndy L. [1]
Hayward, Rodney A. [1, 2]
Sussman, Jeremy B. [1, 2]
Affiliations
[1] Ann Arbor VA Hlth Serv Res & Dev HSR&D Ctr Excell, VA Ctr Clin Management Res, Ann Arbor, MI USA
[2] Univ Michigan, Robert Wood Johnson Fdn Clin Scholars Program, Dept Internal Med, Ann Arbor, MI 48109 USA
Keywords
cardiovascular disease; electronic health record; Framingham risk score; machine learning; nonparametric regression; risk prediction; HEART-DISEASE; INFORMATION-TECHNOLOGY; PREVENTION; STROKE; VALIDATION; ALGORITHMS; COMMITTEE; VETERANS; QUALITY; UPDATE
DOI
10.1097/MLR.0b013e31827da594
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services management)]
Abstract
Background: Use of the electronic health record (EHR) is expected to increase rapidly in the near future, yet little research exists on whether analyzing internal EHR data using flexible, adaptive statistical methods could improve clinical risk prediction. Extensive implementation of EHR in the Veterans Health Administration provides an opportunity for exploration.
Objectives: To compare the performance of various approaches for predicting risk of cerebrovascular and cardiovascular (CCV) death, using traditional risk predictors versus more comprehensive EHR data.
Research Design: Retrospective cohort study. We identified all Veterans Health Administration patients without recent CCV events treated at 12 facilities from 2003 to 2007, and predicted risk using the Framingham risk score, logistic regression, generalized additive modeling, and gradient tree boosting.
Measures: The outcome was CCV-related death within 5 years. We assessed each method's predictive performance with the area under the receiver operating characteristic curve (AUC), the Hosmer-Lemeshow goodness-of-fit test, plots of estimated risk, and reclassification tables, using cross-validation to penalize overfitting.
Results: Regression methods outperformed the Framingham risk score, even with the same predictors (AUC increased from 71% to 73% and calibration also improved). Even better performance was attained in models using additional EHR-derived predictor variables (AUC increased to 78% and net reclassification improvement was as large as 0.29). Nonparametric regression further improved calibration and discrimination compared with logistic regression.
Conclusions: Despite the EHR lacking some risk factors and its imperfect data quality, health care systems may be able to substantially improve risk prediction for their patients by using internally developed EHR-derived models and flexible statistical methodology.
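For readers unfamiliar with the two headline metrics in the abstract, the following is a minimal sketch of how AUC and a categorical net reclassification improvement (NRI) can be computed from predicted risks and observed outcomes. It is illustrative only and is not the authors' code: the function names (`auc`, `risk_category`, `nri`), the risk cutoffs of 10% and 20%, and the toy data are all assumptions made for this example, and the abstract does not state which risk categories the study used.

```python
from typing import Sequence


def auc(risks: Sequence[float], died: Sequence[int]) -> float:
    """Share of (case, non-case) pairs in which the case has the higher
    predicted risk; tied pairs count as one half."""
    cases = [r for r, d in zip(risks, died) if d == 1]
    controls = [r for r, d in zip(risks, died) if d == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_category(risk: float, cutoffs: Sequence[float]) -> int:
    """Index of the risk band: the number of cutoffs at or below the risk."""
    return sum(risk >= c for c in cutoffs)


def nri(old: Sequence[float], new: Sequence[float], died: Sequence[int],
        cutoffs: Sequence[float] = (0.10, 0.20)) -> float:
    """Categorical NRI. The default cutoffs are hypothetical.

    Cases should move to a higher band under the new model and non-cases
    to a lower band; NRI adds the net proportions moving the right way.
    """
    n_case = sum(died)
    n_ctrl = len(died) - n_case
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one non-case")
    net_case = 0  # upward moves minus downward moves among cases
    net_ctrl = 0  # downward moves minus upward moves among non-cases
    for o, n, d in zip(old, new, died):
        move = risk_category(n, cutoffs) - risk_category(o, cutoffs)
        step = (move > 0) - (move < 0)  # +1 up, -1 down, 0 unchanged
        if d == 1:
            net_case += step
        else:
            net_ctrl -= step
    return net_case / n_case + net_ctrl / n_ctrl


if __name__ == "__main__":
    # Made-up toy data, not taken from the study.
    died = [1, 1, 1, 0, 0, 0]
    old_risk = [0.30, 0.08, 0.12, 0.15, 0.05, 0.25]
    new_risk = [0.35, 0.22, 0.12, 0.07, 0.04, 0.15]
    print("AUC (old model):", auc(old_risk, died))
    print("AUC (new model):", auc(new_risk, died))
    print("NRI (new vs old):", nri(old_risk, new_risk, died))
```

The pairwise AUC loop is quadratic in sample size and is meant for clarity; on a cohort of realistic size a rank-based implementation from an established statistics library would be the more practical choice. Per the abstract, the study computed these metrics under cross-validation, which this sketch does not do.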
Pages: 251-258
Number of pages: 8