Gradient boosted trees for evolving data streams

Cited by: 0
Authors
Nuwan Gunasekara
Bernhard Pfahringer
Heitor Gomes
Albert Bifet
Affiliations
[1] AI Institute, University of Waikato
[2] Victoria University of Wellington
[3] LTCI, Télécom Paris, IP Paris
Source
Machine Learning | 2024 / Volume 113
Keywords
Gradient boosting; Stream learning; Gradient boosted trees; Concept drift
DOI
Not available
Abstract
Gradient Boosting is a widely used machine learning technique that has proven highly effective in batch learning. However, its effectiveness in stream learning contexts lags behind bagging-based ensemble methods, which currently dominate the field. One reason for this discrepancy is the challenge of adapting the booster to a new concept following a concept drift. Resetting the entire booster can lead to significant performance degradation, as it struggles to learn the new concept from scratch. Resetting only some parts of the booster can be more effective, but identifying which parts to reset is difficult, given that each boosting step builds on the previous prediction. To overcome these difficulties, we propose Streaming Gradient Boosted Trees (Sgbt), which is trained using the weighted squared loss elicited in XGBoost. Sgbt exploits trees with a replacement strategy to detect and recover from drifts, enabling the ensemble to adapt without sacrificing predictive performance. Our empirical evaluation of Sgbt on a range of streaming datasets with challenging drift scenarios demonstrates that it outperforms current state-of-the-art methods for evolving data streams.
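To make the mechanism described in the abstract concrete, below is a minimal, self-contained Python sketch of the two ideas it mentions; it is not the authors' Sgbt implementation. Each boosting stage is fitted to the residual left by the preceding stages (for plain squared loss the negative gradient is exactly that residual; the weighted squared loss elicited in XGBoost generalises this by fitting targets -g_i/h_i with instance weights h_i), and each stage carries its own drift monitor so that only the affected stage is replaced, never the whole booster. The IncrementalRegressor base learner, the SimpleDriftMonitor, and all hyperparameters are illustrative stand-ins for the streaming trees and detectors (e.g. Hoeffding trees with ADWIN) a real system would use.

```python
from collections import deque


class IncrementalRegressor:
    """Placeholder base learner: an online mean of the targets it has seen."""

    def __init__(self):
        self.n, self.mean = 0, 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x):
        return self.mean


class SimpleDriftMonitor:
    """Naive change check: recent mean of the monitored value far above its long-run mean."""

    def __init__(self, window=200, factor=2.0):
        self.recent = deque(maxlen=window)
        self.total, self.n, self.factor = 0.0, 0, factor

    def update(self, value):
        self.recent.append(value)
        self.total += value
        self.n += 1

    def drift_detected(self):
        if self.n < 5 * self.recent.maxlen:
            return False
        return sum(self.recent) / len(self.recent) > self.factor * (self.total / self.n)


class StreamingGradientBoosting:
    """Additive model F(x) = sum_k lr * f_k(x). Stage k is trained on the residual left
    by stages 1..k-1 (the negative gradient of squared loss), and each stage carries its
    own monitor so only the drifting stage is replaced, not the whole booster."""

    def __init__(self, n_stages=10, lr=0.3):
        self.lr = lr
        self.stages = [IncrementalRegressor() for _ in range(n_stages)]
        self.monitors = [SimpleDriftMonitor() for _ in range(n_stages)]

    def predict_one(self, x):
        return sum(self.lr * f.predict_one(x) for f in self.stages)

    def learn_one(self, x, y):
        partial = 0.0
        for k in range(len(self.stages)):
            residual = y - partial                      # what this stage should explain
            self.stages[k].learn_one(x, residual)
            self.monitors[k].update(abs(residual))      # track the residual magnitude this stage sees
            if self.monitors[k].drift_detected():       # replace only this stage on drift
                self.stages[k] = IncrementalRegressor()
                self.monitors[k] = SimpleDriftMonitor()
            partial += self.lr * self.stages[k].predict_one(x)


if __name__ == "__main__":
    model = StreamingGradientBoosting()
    for t in range(8000):
        x = {"t": t}
        y = 1.0 if t < 4000 else 5.0                    # abrupt concept drift at t = 4000
        if t % 1000 == 0:
            print(t, round(model.predict_one(x), 3))
        model.learn_one(x, y)
```

Running the sketch prints the ensemble's prediction every 1,000 examples; after the abrupt change in the target, the prediction recovers once the per-stage monitors fire and the affected stages are replaced, without resetting the whole booster.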
Pages: 3325 - 3352
Number of pages: 27
Related papers
50 items in total
  • [21] Syntax Description Synthesis Using Gradient Boosted Trees
    Astashkin, Arseny
    Chuvilin, Kirill
    PROCEEDINGS OF THE 20TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT 2017), 2017, : 32 - 39
  • [22] Adversarial Training of Gradient-Boosted Decision Trees
    Calzavara, Stefano
    Lucchese, Claudio
    Tolomei, Gabriele
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 2429 - 2432
  • [23] Gradient boosted decision trees for combustion chemistry integration
    Yao, S.
    Kronenburg, A.
    Shamooni, A.
    Stein, O. T.
    Zhang, W.
    APPLICATIONS IN ENERGY AND COMBUSTION SCIENCE, 2022, 11
  • [24] Scalable Feature Selection for (Multitask) Gradient Boosted Trees
    Han, Cuize
    Rao, Nikhil
    Sorokina, Daria
    Subbian, Karthik
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 885 - 893
  • [25] UTBoost: Gradient Boosted Decision Trees for Uplift Modeling
    Gao, Junjie
    Zheng, Xiangyu
    Wang, DongDong
    Huang, Zhixiang
    Zheng, Bangqi
    PRICAI 2024: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2025, 15281 : 41 - 53
  • [27] Boosted multivariate trees for longitudinal data
    Pande, Amol
    Li, Liang
    Rajeswaran, Jeevanantham
    Ehrlinger, John
    Kogalur, Udaya B.
    Blackstone, Eugene H.
    Ishwaran, Hemant
    MACHINE LEARNING, 2017, 106 (02) : 277 - 305
  • [28] Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams
    Marron, Diego
    Bifet, Albert
    Morales, Gianmarco De Francisci
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 615 - +
  • [29] On change diagnosis in evolving data streams
    Aggarwal, CC
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (05) : 587 - 600
  • [30] Leveraging Bagging for Evolving Data Streams
    Bifet, Albert
    Holmes, Geoff
    Pfahringer, Bernhard
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I: EUROPEAN CONFERENCE, ECML PKDD 2010, 2010, 6321 : 135 - 150