Tree aggregation for random forest class probability estimation

Cited by: 23
Authors
Sage, Andrew J. [1]
Genschel, Ulrike [2]
Nettleton, Dan [2]
Affiliations
[1] Lawrence Univ, Dept Math & Comp Sci, Appleton, WI 54912 USA
[2] Iowa State Univ, Dept Stat, Ames, IA USA
Keywords
aggregation; class probability estimation; random forest; REGRESSION; ERROR
DOI
10.1002/sam.11446
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In random forest methodology, an overall prediction or estimate is made by aggregating predictions made by individual decision trees. Popular implementations of random forests rely on different methods for aggregating predictions. In this study, we provide an empirical analysis of the performance of aggregation approaches available for classification and regression problems. We show that while the choice of aggregation scheme usually has little impact in regression, it can have a profound effect on probability estimation in classification problems. Our study illustrates the causes of calibration issues that arise from two popular aggregation approaches and highlights the important role that terminal nodesize plays in the aggregation of tree predictions. We show that optimal choices for random forest tuning parameters depend heavily on the manner in which tree predictions are aggregated.
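The abstract does not name the two aggregation approaches it compares. As a minimal sketch, assume they are (a) the proportion of trees that vote for a class and (b) the average of the per-tree class probability estimates. The function names and the per-tree values below are hypothetical and chosen only to show how the two schemes can disagree on the same forest.

```python
# Hypothetical illustration: two ways to turn per-tree outputs into a
# forest-level estimate of P(class = 1). Neither function is taken from
# the paper or from any library.

def aggregate_by_vote(tree_probs):
    """Proportion of trees whose own estimate favors class 1 (> 0.5)."""
    votes = [1 if p > 0.5 else 0 for p in tree_probs]
    return sum(votes) / len(votes)

def aggregate_by_average(tree_probs):
    """Mean of the per-tree probability estimates for class 1."""
    return sum(tree_probs) / len(tree_probs)

# Assumed per-tree estimates of P(class = 1) for one observation,
# e.g. the class-1 proportion in the terminal node it falls into.
tree_probs = [0.6, 0.7, 0.55, 0.4, 0.65]

vote_estimate = aggregate_by_vote(tree_probs)    # 4 of 5 trees vote for class 1
avg_estimate = aggregate_by_average(tree_probs)  # mean of the five estimates

print(f"vote proportion: {vote_estimate:.2f}")
print(f"averaged probability: {avg_estimate:.2f}")
```

In this toy case the vote proportion is 0.80 while the averaged probability is 0.58. With fully grown trees whose terminal nodes are pure, each per-tree estimate is 0 or 1, so the two schemes coincide. That is one way to see why terminal node size interacts with the aggregation scheme.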
Pages: 134-150
Page count: 17