Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

Cited by: 0
Authors
Wyner, Abraham J. [1 ]
Olson, Matthew [1 ]
Bleich, Justin [1 ]
Mease, David [2 ]
Affiliations
[1] Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, PA 19104, United States
[2] Apple Inc., United States
DOI: not available
Abstract
There is a large literature explaining why AdaBoost is a successful classifier. The literature on AdaBoost focuses on classifier margins and boosting's interpretation as the optimization of an exponential likelihood function. These explanations, however, have been shown to be incomplete. A random forest is another popular ensemble method for which there is substantially less explanation in the literature. We introduce a novel perspective on AdaBoost and random forests, proposing that the two algorithms work for similar reasons. While both classifiers achieve similar predictive accuracy, random forests cannot be conceived as a direct optimization procedure. Rather, a random forest is a self-averaging, interpolating algorithm that creates what we denote as a spiked-smooth classifier, and we view AdaBoost in the same light. We conjecture that both AdaBoost and random forests succeed because of this mechanism. We provide a number of examples to support this explanation. In the process, we question the conventional wisdom that boosting algorithms for classification require regularization or early stopping and should be limited to low-complexity classes of learners, such as decision stumps. We conclude that boosting should be used like random forests: with large decision trees, without regularization or early stopping. © 2017 Abraham J. Wyner, Matthew Olson, Justin Bleich, and David Mease.
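The concluding recommendation, using boosting like a random forest (large trees, many rounds, no regularization or early stopping), is easy to try with off-the-shelf tools. The following is a minimal sketch in Python, assuming scikit-learn >= 1.2 (for the `estimator` parameter of AdaBoostClassifier); the synthetic dataset, tree depth, and ensemble sizes are illustrative choices of ours, not a reproduction of the paper's experiments.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic binary classification task with some label noise.
    X, y = make_classification(n_samples=4000, n_features=20, flip_y=0.05,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Boosting "like a random forest": large trees (depth 8, far beyond the
    # conventional stumps), many rounds, and no early stopping.
    ada = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=8),  # large base learner
        n_estimators=500,
        random_state=0,
    ).fit(X_tr, y_tr)

    # A random forest: a self-averaging ensemble of near-interpolating trees.
    rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

    print("AdaBoost with large trees:", ada.score(X_te, y_te))
    print("Random forest:            ", rf.score(X_te, y_te))

On this kind of task the two ensembles typically reach comparable test accuracy, which is the parity the abstract describes; the base trees are bounded at depth 8 only so that each boosting round's weighted training error stays strictly positive.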
Pages: 1-33
Related papers (50 in total)
  • [1] On Explaining Random Forests with SAT
    Izza, Yacine
    Marques-Silva, Joao
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2584 - 2591
  • [2] An Improvement of AdaBoost for Face Detection with Random Forests
    Zeng, Jun-Ying
    Cao, Xiao-Hua
    Gan, Jun-Ying
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, 2010, 93 : 22 - 29
  • [3] Are Random Forests Truly the Best Classifiers?
    Wainberg, Michael
    Alipanahi, Babak
    Frey, Brendan J.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [4] Combining Heritance AdaBoost and Random Forests for Face Detection
    Gan, Jun-Ying
    Cao, Xiao-Hua
    Zeng, Jun-Ying
    2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 666 - 669
  • [5] Consistency of Random Forests and Other Averaging Classifiers
    Biau, Gérard
    Devroye, Luc
    Lugosi, Gábor
    JOURNAL OF MACHINE LEARNING RESEARCH, 2008, 9 : 2015 - 2033
  • [6] Explaining Cautious Random Forests via Counterfactuals
    Zhang, Haifei
    Quost, Benjamin
    Masson, Marie-Helene
    BUILDING BRIDGES BETWEEN SOFT AND STATISTICAL METHODOLOGIES FOR DATA SCIENCE, 2023, 1433 : 390 - 397
  • [7] AdaBoost Algorithm with Random Forests for Predicting Breast Cancer Survivability
    Thongkam, Jaree
    Xu, Guandong
    Zhang, Yanchun
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 3062 - 3069
  • [8] Combining Bootstrapping Samples, Random Subspaces and Random Forests to Build Classifiers
    Daho, Mostafa El Habib
    Chikh, Mohammed Amine
    JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS, 2015, 5 (03) : 539 - 544