Individual risk prediction: Comparing random forests with Cox proportional-hazards model by a simulation study

Cited by: 11
Authors
Baralou, Valia [1]
Kalpourtzi, Natasa [1]
Touloumi, Giota [1]
Affiliations
[1] Natl & Kapodistrian Univ Athens, Med Sch, Dept Hyg Epidemiol & Med Stat, Athens 11527, Greece
Keywords
Cox model; machine learning; random survival forest; survival analysis; RANDOM SURVIVAL FORESTS; CARDIOVASCULAR-DISEASE; LIFE-STYLE; REGRESSION; SCORE;
DOI
10.1002/bimj.202100380
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
With big data becoming widely available in healthcare, machine learning algorithms such as random forest (RF), which ignores time-to-event information, and random survival forest (RSF), which handles right-censored data, are used for individual risk prediction as alternatives to the Cox proportional hazards (Cox-PH) model. We aimed to systematically compare RF and RSF with Cox-PH. RSF with three split criteria [log-rank (RSF-LR), log-rank score (RSF-LRS), maximally selected rank statistics (RSF-MSR)], RF, Cox-PH, and Cox-PH with splines (Cox-S) were evaluated through a simulation study based on real data. One hundred eighty scenarios were investigated, assuming different associations between the predictors and the outcome (linear/linear and interactions/nonlinear/nonlinear and interactions), training sample sizes (500/1000/5000), censoring rates (50%/75%/93%), hazard functions (increasing/decreasing/constant), and numbers of predictors (seven, or 15 including noise variables). The methods' performance was evaluated with the time-dependent area under the curve and the integrated Brier score. In all scenarios, RF had the worst performance. In scenarios with a low number of events (<= 70), Cox-PH was at least noninferior to RSF, whereas under the linearity assumption it outperformed RSF. In the presence of interactions, RSF performed better than Cox-PH as the number of events increased, whereas Cox-S reached performance at least similar to that of RSF under nonlinear effects. RSF-LRS performed slightly worse than RSF-LR and RSF-MSR when noise variables and interaction effects were included. When applied to real data, models incorporating survival time performed better. Although RSF algorithms are a promising alternative to the conventional Cox-PH model as data complexity increases, they require a higher number of events for training. In time-to-event analysis, algorithms that consider survival time should be used.
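The scenario dimensions named in the abstract (hazard shape, censoring rate, sample size) can be illustrated with a minimal sketch of a right-censored data generator. This is not the authors' data-generating mechanism, which the abstract does not describe. It assumes a Weibull proportional-hazards model with a single standard-normal predictor and independent exponential censoring. The function names and all parameters (`shape`, `base_rate`, `beta`, `censor_rate`) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_weibull_ph(n, shape, base_rate, beta, censor_rate, seed=0):
    """Draw n right-censored survival times from a Weibull proportional-hazards model.

    Assumed hazard: h(t | x) = base_rate * shape * t**(shape - 1) * exp(beta * x).
    shape > 1 gives an increasing hazard, shape < 1 a decreasing one,
    and shape == 1 a constant (exponential) hazard.
    Returns a list of (x, observed_time, event_indicator) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                     # one hypothetical predictor
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        # Inverse-transform sampling: solve S(t | x) = u for t, where
        # S(t | x) = exp(-base_rate * t**shape * exp(beta * x)).
        scale = base_rate * math.exp(beta * x)
        event_time = (-math.log(u) / scale) ** (1.0 / shape)
        censor_time = rng.expovariate(censor_rate)  # independent censoring (assumption)
        observed = min(event_time, censor_time)
        event = 1 if event_time <= censor_time else 0
        rows.append((x, observed, event))
    return rows


def censoring_fraction(rows):
    """Share of observations whose event was not observed."""
    return sum(1 for _, _, event in rows if event == 0) / len(rows)
```

Raising `censor_rate` shortens the censoring times and so raises the censored share. A target such as 50%, 75%, or 93% would have to be reached by tuning that parameter, since this sketch does not fix the censoring rate directly.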
Pages: 13
Related papers
50 in total
  • [41] Evaluation of a two-part regression calibration to adjust for dietary exposure measurement error in the Cox proportional hazards model: A simulation study
    Agogo, George O.
    van der Voet, Hilko
    van't Veer, Pieter
    van Eeuwijk, Fred A.
    Boshuizen, Hendriek C.
    BIOMETRICAL JOURNAL, 2016, 58 (04) : 766 - 782
  • [42] Random-effects cox proportional hazards model: General variance components methods for time-to-event data
    Pankratz, VS
    de Andrade, M
    Therneau, TM
    GENETIC EPIDEMIOLOGY, 2005, 28 (02) : 97 - 109
  • [43] A cohort study of risk factors for the development of psychosis in Parkinson disease using Cox proportional hazards models
    Yamamoto, K.
    Oeda, T.
    Sawada, H.
    EUROPEAN JOURNAL OF NEUROLOGY, 2009, 16 : 611 - 611
  • [44] Metabolomics signatures associated with fracture prediction; cox proportional hazard model and random forest survival analysis
    Jeong, Sohyun
    Okoro, Paul
    Berry, Sarah
    Kiel, Douglas
    Hsu, Yi-Hsiang
    JOURNAL OF BONE AND MINERAL RESEARCH, 2023, 38 : 152 - 152
  • [45] Choline-to-N-acetyl aspartate and lipids-lactate-to-creatine ratios together with age assemble a significant Cox's proportional-hazards regression model for prediction of survival in high-grade gliomas
    Roldan-Valadez, Ernesto
    Rios, Camilo
    Motola-Kuba, Daniel
    Matus-Santos, Juan
    Villa, Antonio R.
    Moreno-Jimenez, Sergio
    BRITISH JOURNAL OF RADIOLOGY, 2016, 89 (1067):
  • [46] Age, choline-to-N-acetyl aspartate, and lipids-lactate-to-creatine ratios assemble a significant Cox's proportional-hazards regression model for survival prediction in patients with high-grade gliomas
    Liu, Zhenyin
    Zhang, Jing
    BRITISH JOURNAL OF RADIOLOGY, 2017, 90 (1075):
  • [47] SIMULATION PROGRAM FOR ESTIMATING STATISTICAL POWER OF COX PROPORTIONAL HAZARDS MODEL ASSUMING NO SPECIFIC DISTRIBUTION FOR THE SURVIVAL-TIME
    AKAZAWA, K
    NAKAMURA, T
    MORIGUCHI, S
    SHIMADA, M
    NOSE, Y
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 1991, 35 (03) : 203 - 212
  • [48] Linking the Random Forests Model and GIS to Assess Geo-Hazards Risk: A Case Study in Shifang County, China
    Huang, Pei
    Peng, Li
    Pan, Hongyi
    IEEE ACCESS, 2020, 8 : 28033 - 28042
  • [49] Risk reduction derived from the Cox proportional hazard model - A simulation study using real-world data -
    Takeishi, S.
    Inoue, T.
    DIABETES RESEARCH AND CLINICAL PRACTICE, 2024, 209
  • [50] The Study of Risk Factors for Breast Cancer Risk with Cox-proportional Hazard Regression Model
    Liu, Zhuolin
    Yang, Qiyin
    2016 2ND INTERNATIONAL CONFERENCE ON ENVIRONMENTAL POLLUTION AND PUBLIC HEALTH (EPPH 2016), 2016, 8 : 46 - 51