Tree aggregation for random forest class probability estimation

Cited by: 23
Authors
Sage, Andrew J. [1]
Genschel, Ulrike [2]
Nettleton, Dan [2]
Affiliations
[1] Lawrence Univ, Dept Math & Comp Sci, Appleton, WI 54912 USA
[2] Iowa State Univ, Dept Stat, Ames, IA USA
Keywords
aggregation; class probability estimation; random forest; regression; error
DOI
10.1002/sam.11446
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In random forest methodology, an overall prediction or estimate is made by aggregating predictions made by individual decision trees. Popular implementations of random forests rely on different methods for aggregating predictions. In this study, we provide an empirical analysis of the performance of aggregation approaches available for classification and regression problems. We show that while the choice of aggregation scheme usually has little impact in regression, it can have a profound effect on probability estimation in classification problems. Our study illustrates the causes of calibration issues that arise from two popular aggregation approaches and highlights the important role that terminal nodesize plays in the aggregation of tree predictions. We show that optimal choices for random forest tuning parameters depend heavily on the manner in which tree predictions are aggregated.
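To make the aggregation question concrete, here is a minimal Python sketch contrasting two ways of turning per-tree outputs into a forest-level class probability estimate. The abstract does not name the two approaches it studies, so the schemes below (vote share versus averaging of terminal-node class proportions) are assumptions chosen for illustration, and the function names and input numbers are hypothetical rather than taken from the paper.

```python
from typing import List


def aggregate_by_vote(tree_probs: List[List[float]]) -> List[float]:
    """Each tree casts one vote for its most probable class;
    the estimate is the share of trees voting for each class."""
    n_classes = len(tree_probs[0])
    votes = [0] * n_classes
    for probs in tree_probs:
        votes[probs.index(max(probs))] += 1
    return [v / len(tree_probs) for v in votes]


def aggregate_by_averaging(tree_probs: List[List[float]]) -> List[float]:
    """Average the per-tree class proportions directly."""
    n_trees = len(tree_probs)
    n_classes = len(tree_probs[0])
    return [sum(p[k] for p in tree_probs) / n_trees for k in range(n_classes)]


# Hypothetical terminal-node class proportions from four trees for one case
# (illustrative numbers only, not results from the paper).
tree_probs = [[0.60, 0.40], [0.55, 0.45], [0.70, 0.30], [0.20, 0.80]]

vote_estimate = aggregate_by_vote(tree_probs)
avg_estimate = aggregate_by_averaging(tree_probs)
print("vote share:", vote_estimate)
print("averaged:  ", avg_estimate)
```

In this toy input, three of the four trees lean only slightly toward the first class, so the vote share pushes the estimate well away from what averaging gives. With very small terminal nodes, per-tree proportions tend toward 0 or 1 and the two schemes move closer together, which is one plausible reading of why the abstract stresses terminal nodesize.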
Pages: 134-150
Number of pages: 17