Individual risk prediction: Comparing random forests with Cox proportional-hazards model by a simulation study

Cited by: 11
Authors
Baralou, Valia [1]
Kalpourtzi, Natasa [1]
Touloumi, Giota [1]
Affiliations
[1] Natl & Kapodistrian Univ Athens, Med Sch, Dept Hyg Epidemiol & Med Stat, Athens 11527, Greece
Keywords
Cox model; machine learning; random survival forest; survival analysis; RANDOM SURVIVAL FORESTS; CARDIOVASCULAR-DISEASE; LIFE-STYLE; REGRESSION; SCORE;
DOI
10.1002/bimj.202100380
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
With big data becoming widely available in healthcare, machine learning algorithms such as random forest (RF), which ignores time-to-event information, and random survival forest (RSF), which handles right-censored data, are used for individual risk prediction as alternatives to the Cox proportional hazards (Cox-PH) model. We aimed to systematically compare RF and RSF with Cox-PH. RSF with three split criteria [log-rank (RSF-LR), log-rank score (RSF-LRS), maximally selected rank statistics (RSF-MSR)], RF, Cox-PH, and Cox-PH with splines (Cox-S) were evaluated through a simulation study based on real data. One hundred eighty scenarios were investigated, assuming different associations between the predictors and the outcome (linear/linear and interactions/nonlinear/nonlinear and interactions), training sample sizes (500/1000/5000), censoring rates (50%/75%/93%), hazard functions (increasing/decreasing/constant), and numbers of predictors (seven, or 15 including noise variables). The methods' performance was evaluated with the time-dependent area under the curve and the integrated Brier score. In all scenarios, RF had the worst performance. In scenarios with a low number of events (≤70), Cox-PH was at least noninferior to RSF, whereas under the linearity assumption it outperformed RSF. In the presence of interactions, RSF performed better than Cox-PH as the number of events increased, whereas Cox-S performed at least as well as RSF under nonlinear effects. RSF-LRS performed slightly worse than RSF-LR and RSF-MSR when noise variables and interaction effects were included. When applied to real data, models incorporating survival time performed better. Although RSF algorithms are a promising alternative to the conventional Cox-PH model as data complexity increases, they require a higher number of events for training. In time-to-event analysis, algorithms that consider survival time should be used.
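To make the simulation design in the abstract more concrete, the following is a minimal sketch of how right-censored data with an increasing, decreasing, or constant hazard might be generated. It is not the authors' procedure: the Weibull baseline hazard, the standard-normal covariates, the administrative censoring at a fixed cutoff, and the function name `simulate_survival` with its parameters are all assumptions made for illustration.

```python
import math
import random


def simulate_survival(n, shape, beta, censor_rate, seed=0):
    """Simulate right-censored data under a Weibull proportional-hazards model.

    Illustrative assumptions (not taken from the paper):
    - Weibull baseline: shape > 1 gives an increasing hazard, shape < 1 a
      decreasing hazard, and shape == 1 a constant hazard.
    - Covariates are independent standard-normal draws.
    - Censoring is administrative, at a single cutoff time chosen so that the
      requested fraction of subjects is censored.

    Returns a list of (observed_time, event_indicator, covariates) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in beta]
        lin_pred = sum(b * xi for b, xi in zip(beta, x))
        # Inverse-transform sampling from S(t | x) = exp(-t**shape * exp(lin_pred))
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        t = (-math.log(u) / math.exp(lin_pred)) ** (1.0 / shape)
        rows.append((t, x))

    # Keep the earliest n_events event times as events; censor everyone else
    # at the cutoff time.
    n_events = max(1, round(n * (1.0 - censor_rate)))
    cutoff = sorted(t for t, _ in rows)[n_events - 1]
    return [(min(t, cutoff), int(t <= cutoff), x) for t, x in rows]


data = simulate_survival(n=500, shape=1.0, beta=[0.5, -0.3], censor_rate=0.93)
print(sum(event for _, event, _ in data))  # number of observed events
```

With a 93% censoring rate, a training sample of 500 leaves about 35 events and a sample of 1000 about 70, which gives a sense of how few events some of the scenarios contain.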
Pages: 13
Related papers
50 in total
  • [31] Combining symbolic regression with the Cox proportional hazards model improves prediction of heart failure deaths
    Casper Wilstrup
    Chris Cave
    BMC Medical Informatics and Decision Making, 22
  • [32] Prediction of Bladder Cancer Prognosis by Deep Cox Proportional Hazards Model Based on Adversarial Autoencoder
    Wu, Jing
    Ren, Yanqiong
    Han, Fei
    Bao, Xiang
    ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT I, ICIC 2024, 2024, 14881 : 123 - 134
  • [33] Combining symbolic regression with the Cox proportional hazards model improves prediction of heart failure deaths
    Wilstrup, Casper
    Cave, Chris
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [34] RHEUMATOID ARTHRITIS MIGHT NOT BE A RISK FACTOR FOR OSTEOPOROTIC FRACTURES IN NEW ERA; 10-YEAR COHORT TOMORROW STUDY ANALYZED BY COX PROPORTIONAL-HAZARDS REGRESSION MODEL WITH TIME-DEPENDENT VARIABLES
    Koike, T.
    Sugioka, Y.
    Okano, T.
    Tada, M.
    Mamoto, K.
    Inui, K.
    Takahashi, K.
    AGING CLINICAL AND EXPERIMENTAL RESEARCH, 2023, 35 : S180 - S180
  • [35] A simulation study of finite-sample properties of marginal structural Cox proportional hazards models
    Westreich, Daniel
    Cole, Stephen R.
    Schisterman, Enrique F.
    Platt, Robert W.
    STATISTICS IN MEDICINE, 2012, 31 (19) : 2098 - 2109
  • [36] Assessing the prediction accuracy of cure in the Cox proportional hazards cure model: an application to breast cancer data
    Asano, Junichi
    Hirakawa, Akihiro
    Hamada, Chikuma
    PHARMACEUTICAL STATISTICS, 2014, 13 (06) : 357 - 363
  • [37] CONFIDENCE-REGIONS FOR PARAMETERS OF THE PROPORTIONAL HAZARDS MODEL - A SIMULATION STUDY
    MOOLGAVKAR, SH
    VENZON, DJ
    SCANDINAVIAN JOURNAL OF STATISTICS, 1987, 14 (01) : 43 - 56
  • [38] Factors affecting the survival of prediabetic patients: comparison of Cox proportional hazards model and random survival forest method
    Sharafi, Mehdi
    Mohsenpour, Mohammad Ali
    Afrashteh, Sima
    Eftekhari, Mohammad Hassan
    Dehghan, Azizallah
    Farhadi, Akram
    Jafarnezhad, Aboubakr
    Zakeri, Abdoljabbar
    Looha, Mehdi Azizmohammad
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)
  • [39] Predictors for independent external validation of cardiovascular risk clinical prediction rules: Cox proportional hazards regression analyses
    Jong-Wook Ban
    Richard Stevens
    Rafael Perera
    Diagnostic and Prognostic Research, 2 (1)
  • [40] Moving beyond the Cox proportional hazards model in survival data analysis: a cervical cancer study
    Li, Lixian
    Yang, Zijing
    Hou, Yawen
    Chen, Zheng
    BMJ OPEN, 2020, 10 (07)