RANDOMIZED GRADIENT BOOSTING MACHINE

Cited by: 14
Authors
Lu, Haihao [1]
Mazumder, Rahul [2, 3]
Affiliations
[1] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
[2] MIT, Sloan Sch Management, Ctr Operat Res, Cambridge, MA 02142 USA
[3] MIT, Ctr Stat, Cambridge, MA 02142 USA
Keywords
gradient boosting; ensemble methods; convex optimization; coordinate descent; computational guarantees; first order methods; logistic regression; condition number; convergence; optimization
DOI
10.1137/18M1223277
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The Gradient Boosting Machine (GBM) introduced by Friedman [J. H. Friedman, Ann. Statist., 29 (2001), pp. 1189-1232] is a powerful supervised learning algorithm that is very widely used in practice; it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In spite of the usefulness of GBM in practice, our current theoretical understanding of this method is rather limited. In this work, we propose the Randomized Gradient Boosting Machine (RGBM), which leads to substantial computational gains compared to GBM by using a randomization scheme to reduce search in the space of weak learners. We derive novel computational guarantees for RGBM. We also provide a principled guideline towards better step-size selection in RGBM that does not require a line search. Our proposed framework is inspired by a special variant of coordinate descent that combines the benefits of randomized coordinate descent and greedy coordinate descent, and may be of independent interest as an optimization algorithm. As a special case, our results for RGBM lead to superior computational guarantees for GBM. Our computational guarantees depend upon a curious geometric quantity that we call the Minimal Cosine Angle, which relates to the density of weak learners in the prediction space. On a series of numerical experiments on real datasets, we demonstrate the effectiveness of RGBM over GBM in terms of obtaining a model with good training and/or testing data fidelity with a fraction of the computational cost.
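Illustrative note (not part of the published abstract): the coordinate-descent idea the abstract mentions, sampling a random subset of coordinates and then picking greedily within that subset, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a least-squares loss, treats each column of a hypothetical matrix `B` as the predictions of one weak learner, assumes columns are scaled to unit mean square so that a fixed step size of 1 is safe without a line search, and the names `random_then_greedy_cd`, `subset_size`, and `step_size` are invented for illustration.

```python
import numpy as np

def random_then_greedy_cd(B, y, subset_size, n_iters, step_size=1.0, seed=0):
    """Sketch: minimize f(beta) = (1/2n) * ||y - B @ beta||^2.

    Assumption: every column of B has mean square 1, so each coordinate
    has curvature 1 and a fixed step_size in (0, 1] never increases f.
    """
    rng = np.random.default_rng(seed)
    n, K = B.shape
    beta = np.zeros(K)
    residual = np.asarray(y, dtype=float).copy()  # equals y - B @ beta at beta = 0
    losses = [0.5 * np.mean(residual ** 2)]
    for _ in range(n_iters):
        # Random step: look only at a random subset of the K coordinates.
        J = rng.choice(K, size=subset_size, replace=False)
        # Partial derivatives of f restricted to the sampled coordinates.
        grad_J = -(B[:, J].T @ residual) / n
        # Greedy step: within the subset, take the largest |partial derivative|.
        pos = int(np.argmax(np.abs(grad_J)))
        j = J[pos]
        # Fixed-step update of the chosen coordinate (no line search).
        delta = -step_size * grad_J[pos]
        beta[j] += delta
        residual -= delta * B[:, j]
        losses.append(0.5 * np.mean(residual ** 2))
    return beta, losses

# Synthetic demo data (made up for this sketch, not from the paper).
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 50))
B /= np.sqrt(np.mean(B ** 2, axis=0))  # enforce the unit-mean-square assumption
beta_true = np.zeros(50)
beta_true[:5] = 1.0
y = B @ beta_true + 0.1 * rng.standard_normal(200)

beta_hat, losses = random_then_greedy_cd(B, y, subset_size=10, n_iters=300)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

Setting `subset_size` to the number of columns makes every iteration a fully greedy coordinate step, while `subset_size=1` reduces to purely random coordinate descent; intermediate values trade per-iteration search cost against progress per step.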
Pages: 2780-2808
Page count: 29