Robustness of random forests for regression

Cited by: 63
Authors
Roy, Marie-Helene [1 ]
Larocque, Denis [1 ]
Affiliations
[1] HEC Montreal, Dept Management Sci, Montreal, PQ H3T 2A7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
random forest; quantile regression forest; robustness; median; ranks; least-absolute deviations;
DOI
10.1080/10485252.2012.715161
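Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code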
020208 ; 070103 ; 0714 ;
Abstract
In this paper, we empirically investigate the robustness of random forests for regression problems. We also investigate the performance of six variations of the original random forest method, all aimed at improving robustness. These variations are based on three main ideas: (1) robustify the aggregation method, (2) robustify the splitting criterion and (3) apply a robust transformation to the response. More precisely, with the first idea, we use the median (or weighted median), instead of the mean, to combine the predictions from the individual trees. With the second idea, we use least-absolute deviations from the median, instead of least-squares, as the splitting criterion. With the third idea, we build the trees using the ranks of the response instead of the original values. The competing methods are compared via a simulation study with artificial data using two different types of contamination and also with 13 real data sets. Our results show that all three ideas improve the robustness of the original random forest algorithm. However, a robust aggregation of the individual trees is generally more profitable than a robust splitting criterion.
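The three ideas in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`aggregate`, `ls_impurity`, `lad_impurity`, `best_split`, `ranks`) are hypothetical, the split search handles a single numeric feature only, and the weighted median and the step that maps rank-based predictions back to the original scale are not covered.

```python
from statistics import mean, median


def aggregate(tree_predictions, robust=True):
    """Idea 1: combine per-tree predictions with the median instead of the mean."""
    return median(tree_predictions) if robust else mean(tree_predictions)


def ls_impurity(y):
    """Least-squares node impurity: squared deviations from the mean."""
    m = mean(y)
    return sum((v - m) ** 2 for v in y)


def lad_impurity(y):
    """Idea 2: least-absolute-deviations impurity, measured from the median."""
    m = median(y)
    return sum(abs(v - m) for v in y)


def best_split(x, y, impurity):
    """Threshold on one feature minimising the summed impurity of both children.

    Assumes x contains at least two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        cost = impurity(left) + impurity(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or cost < best[0]:
            best = (cost, threshold)
    return best[1]


def ranks(y):
    """Idea 3: replace responses by 1-based ranks; ties share the average rank."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    r = [0.0] * len(y)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y[order[j + 1]] == y[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


# One wildly wrong tree moves the mean but not the median.
preds = [1.0, 1.2, 0.9, 1.1, 50.0]
print(aggregate(preds, robust=False), aggregate(preds, robust=True))

# The rank transform caps the influence of an extreme response value.
print(ranks([10, 30, 20, 1000]))
```

Passing `lad_impurity` or `ls_impurity` to `best_split` switches the splitting criterion without changing the search itself, which mirrors how the abstract treats the criterion as a component that can be replaced.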
Pages: 993-1006
Number of pages: 14
Related papers
50 in total
  • [1] Covariance regression with random forests
    Cansu Alakus
    Denis Larocque
    Aurélie Labbe
    BMC Bioinformatics, 24
  • [2] Covariance regression with random forests
    Alakus, Cansu
    Larocque, Denis
    Labbe, Aurelie
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [3] Regression conformal prediction with random forests
    Johansson, Ulf
    Bostrom, Henrik
    Lofstrom, Tuve
    Linusson, Henrik
    MACHINE LEARNING, 2014, 97 (1-2) : 155 - 176
  • [4] Online random forests regression with memories
    Zhong, Yuan
    Yang, Hongyu
    Zhang, Yanci
    Li, Ping
    KNOWLEDGE-BASED SYSTEMS, 2020, 201
  • [5] Online Rebuilding Regression Random Forests
    Zhong, Yuan
    Yang, Hongyu
    Zhang, Yanci
    Li, Ping
    KNOWLEDGE-BASED SYSTEMS, 2021, 221
  • [6] Regression conformal prediction with random forests
    Ulf Johansson
    Henrik Boström
    Tuve Löfström
    Henrik Linusson
    Machine Learning, 2014, 97 : 155 - 176
  • [7] Random forests regression for soft interval data
    Gaona-Partida, Paul
    Yeh, Chih-Ching
    Sun, Yan
    Cutler, Adele
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024,
  • [8] Target Aggregation Regression based on Random Forests
    Meng, Fan
    Tan, Yue
    Bu, Yi
    8TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND QUANTITATIVE MANAGEMENT (ITQM 2020 & 2021): DEVELOPING GLOBAL DIGITAL ECONOMY AFTER COVID-19, 2022, 199 : 517 - 523
  • [9] Bias-corrected random forests in regression
    Zhang, Guoyi
    Lu, Yan
    JOURNAL OF APPLIED STATISTICS, 2012, 39 (01) : 151 - 160
  • [10] Pathway analysis using random forests classification and regression
    Pang, Herbert
    Lin, Aiping
    Holford, Matthew
    Enerson, Bradley E.
    Lu, Bin
    Lawton, Michael P.
    Floyd, Eugenia
    Zhao, Hongyu
    BIOINFORMATICS, 2006, 22 (16) : 2028 - 2036