Robustness of random forests for regression

Cited by: 64
Authors
Roy, Marie-Helene [1]
Larocque, Denis [1]
Institution
[1] HEC Montreal, Dept Management Sci, Montreal, PQ H3T 2A7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
random forest; quantile regression forest; robustness; median; ranks; least-absolute deviations;
DOI
10.1080/10485252.2012.715161
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208; 070103; 0714;
Abstract
In this paper, we empirically investigate the robustness of random forests for regression problems. We also investigate the performance of six variations of the original random forest method, all aimed at improving robustness. These variations are based on three main ideas: (1) robustify the aggregation method, (2) robustify the splitting criterion and (3) take a robust transformation of the response. More precisely, with the first idea, we use the median (or weighted median), instead of the mean, to combine the predictions from the individual trees. With the second idea, we use least-absolute deviations from the median, instead of least squares, as the splitting criterion. With the third idea, we build the trees using the ranks of the response instead of the original values. The competing methods are compared via a simulation study with artificial data using two different types of contamination and also with 13 real data sets. Our results show that all three ideas improve the robustness of the original random forest algorithm. However, a robust aggregation of the individual trees is generally more profitable than a robust splitting criterion.
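To make the three ideas concrete, the following is a minimal sketch (an illustration under assumptions, not the authors' code) of ideas (1) and (3) using scikit-learn's RandomForestRegressor: the forest is grown on the ranks of the response, and the per-tree predictions are combined with the median instead of the mean. Idea (2), the least-absolute-deviations split, roughly corresponds to criterion="absolute_error" in recent scikit-learn versions.

import numpy as np
from scipy.stats import rankdata
from sklearn.ensemble import RandomForestRegressor

# Toy data with a contaminated response (a few large outliers).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)
y[:10] += 20.0

# Idea (3): grow the trees on the ranks of the response instead of the raw values.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, rankdata(y))

# Idea (1): aggregate the individual trees with the median rather than the mean
# (the built-in predict() averages the trees).
X_new = rng.normal(size=(5, 5))
tree_preds = np.column_stack([tree.predict(X_new) for tree in forest.estimators_])
robust_pred = np.median(tree_preds, axis=1)  # predictions are on the rank scale here

The weighted median variant and the mapping of rank-scale predictions back to the original response scale, as studied in the paper, are omitted from this sketch.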
Pages: 993-1006
Number of pages: 14
Related papers
50 records in total
  • [1] Alakus, Cansu; Larocque, Denis; Labbe, Aurélie. Covariance regression with random forests. BMC Bioinformatics, 2023, 24(1).
  • [2] Johansson, Ulf; Boström, Henrik; Löfström, Tuve; Linusson, Henrik. Regression conformal prediction with random forests. Machine Learning, 2014, 97(1-2): 155-176.
  • [3] Zhong, Yuan; Yang, Hongyu; Zhang, Yanci; Li, Ping. Online random forests regression with memories. Knowledge-Based Systems, 2020, 201.
  • [4] Zhong, Yuan; Yang, Hongyu; Zhang, Yanci; Li, Ping. Online rebuilding regression random forests. Knowledge-Based Systems, 2021, 221.
  • [5] Gaona-Partida, Paul; Yeh, Chih-Ching; Sun, Yan; Cutler, Adele. Random forests regression for soft interval data. Communications in Statistics - Simulation and Computation, 2024.
  • [6] Zhang, Guoyi; Lu, Yan. Bias-corrected random forests in regression. Journal of Applied Statistics, 2012, 39(1): 151-160.
  • [7] Meng, Fan; Tan, Yue; Bu, Yi. Target aggregation regression based on random forests. 8th International Conference on Information Technology and Quantitative Management (ITQM 2020 & 2021): Developing Global Digital Economy after COVID-19, 2022, 199: 517-523.
  • [8] Buri, Muriel; Hothorn, Torsten. Model-based random forests for ordinal regression. International Journal of Biostatistics, 2020, 16(2).