Kappa Updated Ensemble for drifting data stream mining

Cited by: 114
Authors
Cano, Alberto [1]
Krawczyk, Bartosz [1]
Affiliations
[1] Virginia Commonwealth Univ, 401 W Main St,E4251, Richmond, VA 23284 USA
Keywords
Machine learning; Data streams; Concept drift; Classification; Ensemble learning; DYNAMIC WEIGHTED MAJORITY; ABSTAINING CLASSIFIERS; FEATURE-SELECTION; CLASSIFICATION; ADAPTATION;
DOI
10.1007/s10994-019-05840-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from data streams in the presence of concept drift is among the biggest challenges of contemporary machine learning. Algorithms designed for such scenarios must take into account the potentially unbounded size of the data, its constantly changing nature, and the requirement for real-time processing. Ensemble approaches for data stream mining have gained significant popularity due to their high predictive capabilities and effective mechanisms for alleviating concept drift. In this paper, we propose a new ensemble method named Kappa Updated Ensemble (KUE). It is a combination of online and block-based ensemble approaches that uses the Kappa statistic for dynamic weighting and selection of base classifiers. To achieve higher diversity among base learners, each of them is trained using a different subset of features and updated with new instances with a given probability following a Poisson distribution. Furthermore, we update the ensemble with new classifiers only when they contribute positively to the quality of the ensemble. Finally, each base classifier in KUE is capable of abstaining from voting, thus increasing the overall robustness of KUE. An extensive experimental study shows that KUE is capable of outperforming state-of-the-art ensembles on standard and imbalanced drifting data streams while having a low computational complexity. Moreover, we analyze the use of Kappa versus accuracy as the criterion to select and update the classifiers, the contribution of the abstaining mechanism, the contribution of the diversification of classifiers, and the contribution of the hybrid architecture that updates the classifiers in an online manner.
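The abstract's two central ideas, Kappa-based weighting and abstention, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `cohen_kappa` and `kappa_weighted_vote`, the `abstain_below` parameter, and the rule that a classifier abstains when its Kappa falls below that threshold are all assumptions made for illustration. The first function computes the standard Cohen's Kappa, (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa between true labels and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of matching labels.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # assumed convention for the degenerate single-class case
    return (p_o - p_e) / (1.0 - p_e)


def kappa_weighted_vote(predictions, kappas, abstain_below=0.0):
    """Weighted vote; members with Kappa below the threshold abstain.

    The threshold rule is a hypothetical stand-in for the abstaining
    mechanism mentioned in the abstract.
    """
    votes = defaultdict(float)
    for label, kappa in zip(predictions, kappas):
        if kappa < abstain_below:
            continue  # this member abstains
        votes[label] += kappa
    if not votes:
        return None  # every member abstained
    return max(votes, key=votes.get)


# A classifier that always predicts the majority class on a 9:1 stream
# reaches 90% accuracy but a Kappa of 0.
majority_kappa = cohen_kappa([0] * 9 + [1], [0] * 10)
decision = kappa_weighted_vote(["a", "b", "b"], [0.9, -0.1, 0.4])
```

The majority-class example hints at why Kappa rather than accuracy may be the more informative weight on imbalanced streams: accuracy rewards a trivial classifier, while Kappa discounts agreement that chance alone would produce.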
Pages: 175-218
Page count: 44
Related papers
50 in total
  • [1] Kappa Updated Ensemble for drifting data stream mining
    Cano, Alberto
    Krawczyk, Bartosz
    [J]. Machine Learning, 2020, 109 : 175 - 218
  • [2] Adaptive Ensemble Active Learning for Drifting Data Stream Mining
    Krawczyk, Bartosz
    Cano, Alberto
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2763 - 2771
  • [3] ADAW: Age decay accuracy weighted ensemble method for drifting data stream mining
    Srivastava, Ritesh
    Mittal, Veena
    [J]. INTELLIGENT DATA ANALYSIS, 2021, 25 (05) : 1131 - 1152
  • [4] KAPPA as Drift Detector in Data Stream Mining
    Mahdi, Osama A.
    Pardede, Eric
    Ali, Nawfal
    [J]. 12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 314 - 321
  • [5] An adaptive ensemble classifier for mining concept drifting data streams
    Farid, Dewan Md.
    Zhang, Li
    Hossain, Alamgir
    Rahman, Chowdhury Mofizur
    Strachan, Rebecca
    Sexton, Graham
    Dahal, Keshav
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (15) : 5895 - 5906
  • [6] An Aggregate Ensemble for Mining Concept Drifting Data Streams with Noise
    Zhang, Peng
    Zhu, Xingquan
    Shi, Yong
    Wu, Xindong
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2009, 5476 : 1021 - +
  • [7] Pyramid Stack Data Stream Mining for Handling Concept-drifting
    Xu, Zhuoran
    Hou, Cuiqin
    Xia, Yingju
    Sun, Jun
    Inakoshi, Hiroya
    Yugami, Nobuhiro
    [J]. PROCEEDINGS OF THE 2016 IEEE REGION 10 CONFERENCE (TENCON), 2016, : 33 - 37
  • [8] How to adjust an ensemble size in stream data mining?
    Pietruczuk, Lena
    Rutkowski, Leszek
    Jaworski, Maciej
    Duda, Piotr
    [J]. INFORMATION SCIENCES, 2017, 381 : 46 - 54
  • [9] AN ADAPTIVE ENSEMBLE CLASSIFIER FOR CONCEPT DRIFTING STREAM
    Wui, Dengyuan
    Liu, Ying
    Gao, Ge
    Mao, Zhendong
    Ma, Weishan
    He, Tao
    [J]. 2009 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, 2009, : 69 - 75
  • [10] Mining Concept-Drifting and Noisy Data Streams using Ensemble Classifiers
    Ouyang, Zhenzheng
    Zhou, Min
    Wang, Tao
    Wu, Quanyuan
    [J]. 2009 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, VOL IV, PROCEEDINGS, 2009, : 360 - +