The Gradual Resampling Ensemble for mining imbalanced data streams with concept drift

Cited by: 38
Authors
Ren, Siqi [1]
Liao, Bo [1]
Zhu, Wen [1]
Li, Zeng [2]
Liu, Wei [1]
Li, Keqin [1, 3]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
National Natural Science Foundation of China;
Keywords
Concept drift; Data stream mining; Ensemble classifier; Class imbalance; CLASSIFIERS;
DOI
10.1016/j.neucom.2018.01.063
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Knowledge extraction from data streams has received increasing interest in recent years. However, most existing studies assume that the class distribution of data streams is relatively balanced. Reacting to concept drift is more difficult when a data stream is class imbalanced. Current oversampling methods generally absorb previously received minority examples into the current minority set selectively, by evaluating the similarity between past minority examples and the current minority set. However, this similarity evaluation is easily affected by data difficulty factors. Moreover, these oversampling techniques ignore the majority class distribution and thus risk class overlap. To overcome these issues, we propose an ensemble classifier called Gradual Resampling Ensemble (GRE), which can handle data streams that exhibit both concept drift and class imbalance. On the one hand, a selective resampling method that avoids drifting data is applied to select a subset of previous minority examples for amplifying the current minority set. Disjuncts are discovered by DBSCAN clustering, so the influence of small disjuncts and outliers on the similarity evaluation can be avoided. Only minority examples with a low probability of overlapping with the current majority set are selected for resampling the current minority set. On the other hand, previous component classifiers are updated using the latest instances, so the ensemble can quickly adapt to a new condition regardless of the type of concept drift. Through gradual oversampling of previous chunks using the current minority events, the class distribution of past chunks can be balanced. Favorable results in comparison with other algorithms suggest that GRE can maintain good performance on the minority class without sacrificing majority class performance. (c) 2018 Elsevier B.V. All rights reserved.
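The selection idea in the abstract (reuse only those past minority examples that are unlikely to overlap with the current majority set) can be illustrated with a minimal sketch. This is not the GRE algorithm: the DBSCAN clustering step and the ensemble update are omitted, and the nearest-neighbor distance comparison is an assumed stand-in for the paper's overlap probability. The function name `select_past_minority` and the toy data are hypothetical.

```python
import math


def select_past_minority(past_minority, cur_minority, cur_majority):
    """Keep past minority points that lie closer to the current minority
    set than to the current majority set (assumed proxy for low overlap)."""
    selected = []
    for p in past_minority:
        # Distance to the nearest example of each current class.
        d_min = min(math.dist(p, q) for q in cur_minority)
        d_maj = min(math.dist(p, q) for q in cur_majority)
        if d_min < d_maj:
            selected.append(p)
    return selected


# Toy 2-D chunk: the minority class sits near the origin, the majority near (5, 5).
cur_minority = [(0.0, 0.0), (0.5, 0.2)]
cur_majority = [(5.0, 5.0), (4.5, 5.5), (6.0, 4.0)]
# Past minority examples; the second one now falls inside the majority region.
past_minority = [(0.2, 0.1), (4.8, 5.1), (1.0, 0.5)]

# Amplify the current minority set with the selected past examples.
amplified = cur_minority + select_past_minority(
    past_minority, cur_minority, cur_majority
)
```

In this toy setting the past example at (4.8, 5.1) is discarded because it sits inside the current majority region, which is the kind of class overlap the abstract says GRE tries to avoid.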
Pages: 150 - 166
Number of pages: 17
Related papers
50 in total
  • [1] Dynamic Ensemble Selection for Imbalanced Data Streams With Concept Drift
    Jiao, Botao
    Guo, Yinan
    Gong, Dunwei
    Chen, Qiuju
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (01) : 1278 - 1291
  • [2] Batch Weighted Ensemble for Mining Data Streams with Concept Drift
    Deckert, Magdalena
    [J]. FOUNDATIONS OF INTELLIGENT SYSTEMS, 2011, 6804 : 290 - 299
  • [3] On Ensemble Components Selection in Data Streams Scenario with Gradual Concept-Drift
    Duda, Piotr
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2018), PT II, 2018, 10842 : 311 - 320
  • [4] An Ensemble Classifier Algorithm for Mining data Streams Based on Concept Drift
    Geng, Yushui
    Zhang, Jianguo
    [J]. 2017 10TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2017 : 227 - 230
  • [5] Incremental learning imbalanced data streams with concept drift: The dynamic updated ensemble algorithm
    Li, Zeng
    Huang, Wenchao
    Xiong, Yan
    Ren, Siqi
    Zhu, Tuanfei
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 195
  • [6] ENSEMBLE ALGORITHM FOR DATA STREAMS WITH CONCEPT DRIFT
    Tase, R. O. R.
    Cabrera, A. V.
    Naranjo, D. L. O.
    Diaz, A. A. O.
    Blanco, I. F.
    [J]. HOLOS, 2016, 32 (02) : 24 - 36
  • [7] An ensemble classifier framework for mining imbalanced data streams
    Ouyang, Zhen-Zheng
    Luo, Jian-Shu
    Hu, Dong-Min
    Wu, Quan-Yuan
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2010, 38 (01) : 184 - 189
  • [8] Cost-sensitive continuous ensemble kernel learning for imbalanced data streams with concept drift
    Chen, Yingying
    Yang, Xiaowei
    Dai, Hong-Liang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [9] A comprehensive ensemble classification techniques detecting and managing concept drift in dynamic imbalanced data streams
    Junaid, K. A. Mohamed
    Paulraj, D.
    Sethukarasi, T.
    [J]. WIRELESS NETWORKS, 2024,
  • [10] Incremental Weighted Ensemble for Data Streams With Concept Drift
    Jiao B.
    Guo Y.
    Yang C.
    Pu J.
    Zheng Z.
    Gong D.
    [J]. IEEE Transactions on Artificial Intelligence, 2024, 5 (01) : 92 - 103