Individual risk prediction: Comparing random forests with Cox proportional-hazards model by a simulation study

Cited by: 11
Authors
Baralou, Valia [1]
Kalpourtzi, Natasa [1]
Touloumi, Giota [1]
Affiliations
[1] Natl & Kapodistrian Univ Athens, Med Sch, Dept Hyg Epidemiol & Med Stat, Athens 11527, Greece
Keywords
Cox model; machine learning; random survival forest; survival analysis; RANDOM SURVIVAL FORESTS; CARDIOVASCULAR-DISEASE; LIFE-STYLE; REGRESSION; SCORE
DOI
10.1002/bimj.202100380
CLC classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
With big data becoming widely available in healthcare, machine learning algorithms are used for individual risk prediction as alternatives to the Cox proportional hazards (Cox-PH) model. These include random forest (RF), which ignores time-to-event information, and random survival forest (RSF), which handles right-censored data. We aimed to systematically compare RF and RSF with Cox-PH. RSF with three split criteria [log-rank (RSF-LR), log-rank score (RSF-LRS), and maximally selected rank statistics (RSF-MSR)], RF, Cox-PH, and Cox-PH with splines (Cox-S) were evaluated through a simulation study based on real data. One hundred eighty scenarios were investigated, assuming different associations between the predictors and the outcome (linear/linear and interactions/nonlinear/nonlinear and interactions), training sample sizes (500/1000/5000), censoring rates (50%/75%/93%), hazard functions (increasing/decreasing/constant), and numbers of predictors (seven, or 15 including noise variables). Performance was evaluated with the time-dependent area under the curve and the integrated Brier score. In all scenarios, RF had the worst performance. In scenarios with a low number of events (≤ 70), Cox-PH was at least noninferior to RSF, and under the linearity assumption it outperformed RSF. In the presence of interactions, RSF performed better than Cox-PH as the number of events increased, whereas under nonlinear effects Cox-S performed at least as well as RSF. RSF-LRS performed slightly worse than RSF-LR and RSF-MSR when noise variables and interaction effects were included. When applied to real data, models incorporating survival time performed better. Although RSF algorithms are a promising alternative to the conventional Cox-PH model as data complexity increases, they require a higher number of events for training. In time-to-event analysis, algorithms that consider survival time should be used.
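The abstract varies the hazard shape (increasing/decreasing/constant) and the censoring rate across scenarios. As a rough illustration only, and not the authors' simulation design, the sketch below draws right-censored survival times from a Weibull proportional-hazards model by inverse-transform sampling. The function name, its parameters, the single standard-normal predictor, and the independent exponential censoring mechanism are all assumptions made for this example.

```python
import math
import random


def simulate_weibull_ph(n, shape, scale, beta, censor_rate, seed=0):
    """Draw n right-censored observations from a Weibull PH model (illustrative).

    Assumed baseline hazard: h0(t) = scale * shape * t**(shape - 1), so
    shape > 1 gives an increasing hazard, shape < 1 a decreasing one,
    and shape == 1 a constant one.
    Returns a list of (observed_time, event_indicator, x) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # one hypothetical predictor
        u = 1.0 - rng.random()       # uniform on (0, 1], avoids log(0)
        # Invert S(t | x) = exp(-scale * t**shape * exp(beta * x)) at S = u.
        t_event = (-math.log(u) / (scale * math.exp(beta * x))) ** (1.0 / shape)
        # Assumed censoring mechanism: independent exponential times.
        c = rng.expovariate(censor_rate)
        rows.append((min(t_event, c), t_event <= c, x))
    return rows


def event_count(rows):
    """Number of observed (uncensored) events."""
    return sum(1 for _, event, _ in rows if event)


if __name__ == "__main__":
    for label, shape in [("decreasing", 0.5), ("constant", 1.0), ("increasing", 2.0)]:
        data = simulate_weibull_ph(500, shape, scale=0.1, beta=0.5, censor_rate=0.3)
        print(label, "hazard:", event_count(data), "events out of", len(data))
```

In a sketch like this, raising `censor_rate` lowers the number of observed events, which is the quantity the abstract identifies as limiting RSF performance. Reaching a target censoring percentage such as 50%, 75%, or 93% would require tuning `censor_rate` for each hazard shape.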
Pages: 13