Using machine learning to predict cardiovascular risk using self-reported questionnaires: Findings from the 45 and Up Study

Cited by: 2
Authors
Wang, Hongkuan [1]
Tucker, William J. [2]
Jonnagaddala, Jitendra [3]
Schutte, Aletta E. [3, 4]
Jalaludin, Bin [3, 5]
Rye, Kerry-Anne [2]
Liaw, Siaw-Teng [6]
Wong, Raymond K. [1, 9]
Ong, Kwok Leung [2, 7, 8]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[2] Univ New South Wales, Sch Biomed Sci, Sydney, NSW, Australia
[3] Univ New South Wales, Sch Populat Hlth, Sydney, NSW, Australia
[4] George Inst Global Hlth, Sydney, NSW, Australia
[5] Univ New South Wales, Ingham Inst Appl Med Res, Sydney, Australia
[6] Univ New South Wales, WHO, Sch Populat Hlth, Collaborating Ctr ehlth, Sydney, NSW, Australia
[7] Univ Sydney, NHMRC Clin Trials Ctr, Med Fdn Bldg, 92-94 Parramatta Rd, Camperdown, NSW 2050, Australia
[8] Room 134, Med Fdn Bldg, 92-94 Parramatta Rd, Camperdown, NSW 2050, Australia
[9] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
Funding
UK Medical Research Council;
Keywords
Cardiovascular disease; Classification; Machine learning; Risk prediction; Survey; BODY-MASS INDEX; SOCIAL DETERMINANTS; POPULATION; DISEASES; MODELS;
DOI
10.1016/j.ijcard.2023.05.030
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Background: Machine learning has been shown to outperform traditional statistical methods for risk prediction model development. We aimed to develop machine learning-based risk prediction models for cardiovascular mortality and hospitalisation for ischemic heart disease (IHD) using self-reported questionnaire data.
Methods: The 45 and Up Study was a retrospective population-based study in New South Wales, Australia (2005-2009). Self-reported healthcare survey data on 187,268 participants without a history of cardiovascular disease was linked to hospitalisation and mortality data. We compared different machine learning algorithms, including traditional classification methods (support vector machine (SVM), neural network, random forest and logistic regression) and survival methods (fast survival SVM, Cox regression and random survival forest).
Results: A total of 3687 participants experienced cardiovascular mortality and 12,841 participants had IHD-related hospitalisation over a median follow-up of 10.4 years and 11.6 years, respectively. The best model for cardiovascular mortality was a Cox survival regression with L1 penalty at a re-sampled case/non-case ratio of 0.3, achieved by under-sampling the non-cases. This model had Uno's and Harrell's concordance indexes of 0.898 and 0.900, respectively. The best model for IHD hospitalisation was a Cox survival regression with L1 penalty at a re-sampled case/non-case ratio of 1.0, with Uno's and Harrell's concordance indexes of 0.711 and 0.718, respectively.
Conclusion: Machine learning-based risk prediction models developed using self-reported questionnaire data had good prediction performance. These models may have the potential to be used in initial screening tests to identify high-risk individuals before undergoing costly investigation.
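Illustrative sketch (not the authors' code): two ideas named in the abstract, under-sampling non-cases to a target case/non-case ratio and Harrell's concordance index, can be written out in a few lines of Python. The function names `undersample_non_cases` and `harrell_c_index`, the fixed random seed, and the rule of counting tied risk scores as 0.5 are assumptions made for this example. The abstract does not describe how the study implemented either step. Uno's concordance index additionally applies inverse-probability-of-censoring weights and is not shown here.

```python
import random


def undersample_non_cases(labels, ratio, seed=0):
    """Keep every case (label 1) and a random subset of non-cases (label 0)
    so that n_cases / n_kept_non_cases is approximately `ratio`.
    Returns the sorted indices of the retained records."""
    cases = [i for i, y in enumerate(labels) if y == 1]
    non_cases = [i for i, y in enumerate(labels) if y == 0]
    # Number of non-cases needed for the target ratio, capped at what exists.
    n_keep = min(len(non_cases), round(len(cases) / ratio))
    rng = random.Random(seed)  # assumption: fixed seed for reproducibility
    kept = rng.sample(non_cases, n_keep)
    return sorted(cases + kept)


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.
    A pair (i, j) is comparable when subject i had the event and
    times[i] < times[j]; it is concordant when subject i also has the
    higher predicted risk. Tied risk scores count as 0.5 (assumption)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With 30 cases and 1000 non-cases, `undersample_non_cases(labels, 0.3)` retains all 30 cases and 100 randomly chosen non-cases. A value of 1.0 from `harrell_c_index` means every comparable pair is ranked correctly, and 0.5 corresponds to random ranking.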
Pages: 149-156
Number of pages: 8
Related papers
50 in total
  • [11] Self-Reported Cardiovascular Disease and the Risk of Lung Cancer, the HUNT Study
    Hatlen, Peter
    Langhammer, Arnulf
    Carlsen, Sven Magnus
    Salvesen, Oyvind
    Amundsen, Tore
    JOURNAL OF THORACIC ONCOLOGY, 2014, 9 (07) : 940 - 946
  • [12] Validity of Self-Reported Cardiovascular Disease Risk From Survey Questions
    Duval, Sue
    Van't Hof, Jeremy
    Steffen, Lyn M.
    Luepker, Russell V.
    CIRCULATION, 2019, 139
  • [13] Concordance of Adherence Measurement Using Self-Reported Adherence Questionnaires and Medication Monitoring Devices
    Lizheng Shi
    Jinan Liu
    Yordanka Koleva
    Vivian Fonseca
    Anupama Kalsekar
    Manjiri Pawaskar
    PharmacoEconomics, 2010, 28 : 1097 - 1107
  • [14] Concordance of Adherence Measurement Using Self-Reported Adherence Questionnaires and Medication Monitoring Devices
    Shi, Lizheng
    Liu, Jinan
    Koleva, Yordanka
    Fonseca, Vivian
    Kalsekar, Anupama
    Pawaskar, Manjiri
    PHARMACOECONOMICS, 2010, 28 (12) : 1097 - 1107
  • [15] Self-reported questionnaires for lymphoedema: a systematic review of measurement properties using COSMIN framework
    Paramanandam, Vincent Singh
    Lee, Mi-Joung
    Kilbreath, Sharon L.
    Dylke, Elizabeth S.
    ACTA ONCOLOGICA, 2021, 60 (03) : 379 - 391
  • [16] Validation of maternal self-reported pregnancy complications using web-based questionnaires in a prospective cohort study
    Beekers, Pim
    Jamaladin, Hussein
    van Drongelen, Joris
    Roeleveld, Nel
    van Gelder, Marleen M. H. J.
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2019, 28 : 358 - 359
  • [17] Self-reported and measured cardiorespiratory fitness similarly predict cardiovascular disease risk in young adults
    Ortega, F. B.
    Sanchez-Lopez, M.
    Solera-Martinez, M.
    Fernandez-Sanchez, A.
    Sjostrom, M.
    Martinez-Vizcaino, V.
    SCANDINAVIAN JOURNAL OF MEDICINE & SCIENCE IN SPORTS, 2013, 23 (06) : 749 - 757
  • [18] Using self-reported health-related quality of life to predict incident atherosclerotic cardiovascular disease events
    Pinheiro, Laura
    Reshetnyak, Evgeniya
    Sterling, Madeline
    Richman, Joshua
    Kern, Lisa
    Safford, Monika
    QUALITY OF LIFE RESEARCH, 2018, 27 : S80 - S80
  • [19] Objective Classification of mTBI Using Machine Learning on a Combination of Frontopolar Electroencephalography Measurements and Self-reported Symptoms
    M. Windy McNerney
    Thomas Hobday
    Betsy Cole
    Rick Ganong
    Nina Winans
    Dennis Matthews
    Jim Hood
    Stephen Lane
    Sports Medicine - Open, 2019, 5
  • [20] Objective Classification of mTBI Using Machine Learning on a Combination of Frontopolar Electroencephalography Measurements and Self-reported Symptoms
    McNerney, M. Windy
    Hobday, Thomas
    Cole, Betsy
    Ganong, Rick
    Winans, Nina
    Matthews, Dennis
    Hood, Jim
    Lane, Stephen
    SPORTS MEDICINE-OPEN, 2019, 5 (01)