Gradient boosted trees for evolving data streams

Citations: 0
Authors
Nuwan Gunasekara
Bernhard Pfahringer
Heitor Gomes
Albert Bifet
Affiliations
[1] University of Waikato, AI Institute
[2] Victoria University of Wellington, LTCI, Télécom Paris
[3] IP Paris
Source
Machine Learning | 2024 / Volume 113
Keywords
Gradient boosting; Stream learning; Gradient boosted trees; Concept drift
DOI
Not available
Abstract
Gradient boosting is a widely used machine learning technique that has proven highly effective in batch learning. However, its effectiveness in stream learning lags behind that of bagging-based ensemble methods, which currently dominate the field. One reason for this discrepancy is the challenge of adapting the booster to a new concept following a concept drift. Resetting the entire booster can lead to significant performance degradation as it struggles to learn the new concept. Resetting only some parts of the booster can be more effective, but identifying which parts to reset is difficult, given that each boosting step builds on the previous prediction. To overcome these difficulties, we propose Streaming Gradient Boosted Trees (Sgbt), which is trained using the weighted squared loss elicited in XGBoost. Sgbt exploits trees with a replacement strategy to detect and recover from drifts, thus enabling the ensemble to adapt without sacrificing predictive performance. Our empirical evaluation of Sgbt on a range of streaming datasets with challenging drift scenarios demonstrates that it outperforms current state-of-the-art methods for evolving data streams.
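The weighted squared loss mentioned in the abstract can be illustrated with a small sketch. In XGBoost's second-order approximation, fitting a booster to gradient g and Hessian h is equivalent, up to a constant, to a squared-error fit to the target -g/h with instance weight h. The code below applies that idea one instance at a time. It is not the authors' implementation: the logistic loss, the `RunningWeightedMean` stand-in for an incremental regression tree, the `boost_one_instance` helper, the learning rate, and the predict-then-train ordering are all assumptions made for illustration, and the drift detection and tree replacement described in the abstract are omitted.

```python
import math


class RunningWeightedMean:
    """Hypothetical stand-in for an incremental regression tree.

    It ignores the features and predicts the weighted mean of the
    targets seen so far; a real streaming tree would split on x.
    """

    def __init__(self):
        self.sum_w = 0.0
        self.sum_wt = 0.0

    def predict(self, x):
        return self.sum_wt / self.sum_w if self.sum_w > 0 else 0.0

    def learn(self, x, target, weight):
        self.sum_w += weight
        self.sum_wt += weight * target


def grad_hess_logistic(y, score):
    """Gradient and Hessian of the logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-score))
    return p - y, p * (1.0 - p)


def boost_one_instance(stages, x, y, learning_rate=0.1, eps=1e-12):
    """Pass one labelled instance through every boosting stage.

    Each stage is trained on the weighted squared loss form:
    target = -g / h, weight = h, computed from the score accumulated
    by the earlier stages. Returns the raw score built from the
    predictions made before this instance was learned.
    """
    score = 0.0
    for stage in stages:
        g, h = grad_hess_logistic(y, score)
        h = max(h, eps)  # guard against a vanishing Hessian
        prediction = stage.predict(x)  # assumed predict-then-train order
        stage.learn(x, -g / h, h)
        score += learning_rate * prediction
    return score


stages = [RunningWeightedMean() for _ in range(3)]
first = boost_one_instance(stages, x=[0.0], y=1)
second = boost_one_instance(stages, x=[0.0], y=1)
```

After a single positive instance, each stage has learned the target 2.0 with weight 0.25 (g = -0.5, h = 0.25 at a raw score of 0), so the second pass already produces a positive raw score.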
Pages: 3325 - 3352
Page count: 27
Related papers
50 in total
  • [1] Gradient boosted trees for evolving data streams
    Gunasekara, Nuwan
    Pfahringer, Bernhard
    Gomes, Heitor
    Bifet, Albert
    MACHINE LEARNING, 2024, 113 (05) : 3325 - 3352
  • [2] Learning model trees from evolving data streams
    Ikonomovska, Elena
    Gama, Joao
    Dzeroski, Saso
    DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 23 (01) : 128 - 168
  • [3] Mining frequent closed trees in evolving data streams
    Bifet, Albert
    Gavalda, Ricard
    INTELLIGENT DATA ANALYSIS, 2011, 15 (01) : 29 - 48
  • [4] Learning model trees from evolving data streams
    Elena Ikonomovska
    João Gama
    Sašo Džeroski
    Data Mining and Knowledge Discovery, 2011, 23 : 128 - 168
  • [5] Boost-R: Gradient boosted trees for recurrence data
    Liu, Xiao
    Pan, Rong
    JOURNAL OF QUALITY TECHNOLOGY, 2021, 53 (05) : 545 - 565
  • [6] Evolving fuzzy pattern trees for binary classification on data streams
    Shaker, Ammar
    Senge, Robin
    Huellermeier, Eyke
    INFORMATION SCIENCES, 2013, 220 : 34 - 45
  • [7] Summarizing evolving data streams using dynamic prefix trees
    Rojas, Carlos
    Nasraoui, Olfa
    PROCEEDINGS OF THE IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE: WI 2007, 2007, : 221 - 227
  • [8] Gradient boosted trees for spatial data and its application to medical imaging data
    Iranzad, Reza
    Liu, Xiao
    Chaovalitwongse, W. Art
    Hippe, Daniel
    Wang, Shouyi
    Han, Jie
    Thammasorn, Phawis
    Zeng, Jing
    Duan, Chunyan
    Bowen, Stephen
    IISE TRANSACTIONS ON HEALTHCARE SYSTEMS ENGINEERING, 2022, 12 (03) : 165 - 179
  • [9] Gradient Boosted Trees for Corrective Learning
    Oguz, Baris U.
    Shinohara, Russell T.
    Yushkevich, Paul A.
    Oguz, Ipek
    MACHINE LEARNING IN MEDICAL IMAGING (MLMI 2017), 2017, 10541 : 203 - 211
  • [10] Learning to predict soccer results from relational data with gradient boosted trees
    Hubacek, Ondrej
    Sourek, Gustav
    Zelezny, Filip
    MACHINE LEARNING, 2019, 108 (01) : 29 - 47