Gradient boosted trees for evolving data streams

Cited by: 0
Authors
Nuwan Gunasekara
Bernhard Pfahringer
Heitor Gomes
Albert Bifet
Affiliations
[1] AI Institute, University of Waikato
[2] Victoria University of Wellington
[3] LTCI, Télécom Paris, IP Paris
Source
Machine Learning | 2024 / Volume 113
Keywords
Gradient boosting; Stream learning; Gradient boosted trees; Concept drift
DOI
Not available
Abstract
Gradient Boosting is a widely used machine learning technique that has proven highly effective in batch learning. However, its effectiveness in stream learning contexts lags behind bagging-based ensemble methods, which currently dominate the field. One reason for this discrepancy is the challenge of adapting the booster to a new concept following a concept drift. Resetting the entire booster can lead to significant performance degradation, as it struggles to learn the new concept from scratch. Resetting only some parts of the booster can be more effective, but identifying which parts to reset is difficult, given that each boosting step builds on the previous prediction. To overcome these difficulties, we propose Streaming Gradient Boosted Trees (Sgbt), which is trained using the weighted squared loss elicited in XGBoost. Sgbt exploits trees with a replacement strategy to detect and recover from drifts, enabling the ensemble to adapt without sacrificing predictive performance. Our empirical evaluation of Sgbt on a range of streaming datasets with challenging drift scenarios demonstrates that it outperforms current state-of-the-art methods for evolving data streams.
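To make the mechanism described in the abstract concrete, below is a minimal, self-contained Python sketch of the two ideas it mentions; it is not the authors' Sgbt implementation. Each boosting stage is fitted to the residual left by the preceding stages (for plain squared loss the negative gradient is exactly that residual; the weighted squared loss elicited in XGBoost generalises this by fitting targets -g_i/h_i with instance weights h_i), and each stage carries its own drift monitor so that only the affected stage is replaced, never the whole booster. The IncrementalRegressor base learner, the SimpleDriftMonitor, and all hyperparameters are illustrative stand-ins for the streaming trees and detectors (e.g. Hoeffding trees with ADWIN) a real system would use.

```python
from collections import deque


class IncrementalRegressor:
    """Placeholder base learner: an online mean of the targets it has seen."""

    def __init__(self):
        self.n, self.mean = 0, 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x):
        return self.mean


class SimpleDriftMonitor:
    """Naive change check: recent mean of the monitored value far above its long-run mean."""

    def __init__(self, window=200, factor=2.0):
        self.recent = deque(maxlen=window)
        self.total, self.n, self.factor = 0.0, 0, factor

    def update(self, value):
        self.recent.append(value)
        self.total += value
        self.n += 1

    def drift_detected(self):
        if self.n < 5 * self.recent.maxlen:
            return False
        return sum(self.recent) / len(self.recent) > self.factor * (self.total / self.n)


class StreamingGradientBoosting:
    """Additive model F(x) = sum_k lr * f_k(x). Stage k is trained on the residual left
    by stages 1..k-1 (the negative gradient of squared loss), and each stage carries its
    own monitor so only the drifting stage is replaced, not the whole booster."""

    def __init__(self, n_stages=10, lr=0.3):
        self.lr = lr
        self.stages = [IncrementalRegressor() for _ in range(n_stages)]
        self.monitors = [SimpleDriftMonitor() for _ in range(n_stages)]

    def predict_one(self, x):
        return sum(self.lr * f.predict_one(x) for f in self.stages)

    def learn_one(self, x, y):
        partial = 0.0
        for k in range(len(self.stages)):
            residual = y - partial                      # what this stage should explain
            self.stages[k].learn_one(x, residual)
            self.monitors[k].update(abs(residual))      # track the residual magnitude this stage sees
            if self.monitors[k].drift_detected():       # replace only this stage on drift
                self.stages[k] = IncrementalRegressor()
                self.monitors[k] = SimpleDriftMonitor()
            partial += self.lr * self.stages[k].predict_one(x)


if __name__ == "__main__":
    model = StreamingGradientBoosting()
    for t in range(8000):
        x = {"t": t}
        y = 1.0 if t < 4000 else 5.0                    # abrupt concept drift at t = 4000
        if t % 1000 == 0:
            print(t, round(model.predict_one(x), 3))
        model.learn_one(x, y)
```

Running the sketch prints the ensemble's prediction every 1,000 examples; after the abrupt change in the target, the prediction recovers once the per-stage monitors fire and the affected stages are replaced, without resetting the whole booster.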
Pages: 3325 - 3352
Number of pages: 27
Related papers
50 items in total
  • [21] Syntax Description Synthesis Using Gradient Boosted Trees
    Astashkin, Arseny
    Chuvilin, Kirill
    PROCEEDINGS OF THE 20TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT 2017), 2017, : 32 - 39
  • [22] Adversarial Training of Gradient-Boosted Decision Trees
    Calzavara, Stefano
    Lucchese, Claudio
    Tolomei, Gabriele
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 2429 - 2432
  • [23] Gradient boosted decision trees for combustion chemistry integration
    Yao, S.
    Kronenburg, A.
    Shamooni, A.
    Stein, O. T.
    Zhang, W.
    APPLICATIONS IN ENERGY AND COMBUSTION SCIENCE, 2022, 11
  • [24] Scalable Feature Selection for (Multitask) Gradient Boosted Trees
    Han, Cuize
    Rao, Nikhil
    Sorokina, Daria
    Subbian, Karthik
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 885 - 893
  • [25] UTBoost: Gradient Boosted Decision Trees for Uplift Modeling
    Gao, Junjie
    Zheng, Xiangyu
    Wang, DongDong
    Huang, Zhixiang
    Zheng, Bangqi
    PRICAI 2024: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2025, 15281 : 41 - 53
  • [27] Boosted multivariate trees for longitudinal data
    Pande, Amol
    Li, Liang
    Rajeswaran, Jeevanantham
    Ehrlinger, John
    Kogalur, Udaya B.
    Blackstone, Eugene H.
    Ishwaran, Hemant
    MACHINE LEARNING, 2017, 106 (02) : 277 - 305
  • [28] Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams
    Marron, Diego
    Bifet, Albert
    Morales, Gianmarco De Francisci
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 615 - +
  • [29] On change diagnosis in evolving data streams
    Aggarwal, CC
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (05) : 587 - 600
  • [30] Leveraging Bagging for Evolving Data Streams
    Bifet, Albert
    Holmes, Geoff
    Pfahringer, Bernhard
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I: EUROPEAN CONFERENCE, ECML PKDD 2010, 2010, 6321 : 135 - 150