G-mean Weighted Classification Method for Imbalanced Data Stream with Concept Drift

Authors
Liang B. [1]
Li G. [1]
Dai C. [1]
Affiliations
[1] School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi
Funding
National Natural Science Foundation of China
Keywords
Class imbalance; Classification; Concept drift; Data stream; Ensemble learning
DOI
10.7544/issn1000-1239.20210471
Abstract
Concept drift and class imbalance in data streams seriously degrade the performance and stability of traditional data stream classification algorithms. To address this problem for binary classification of data streams, an online G-mean weighted ensemble classification method for imbalanced data streams with concept drift, termed OGUEIL, is proposed. It adds an online weight-update mechanism for component classifiers to block-based ensemble algorithms and combines it with hybrid resampling and an adaptive sliding window algorithm. OGUEIL is built on an ensemble learning framework: whenever a new instance arrives, each component classifier in the ensemble and its weight are updated online, and minority class instances are randomly oversampled at the same time. In particular, each component classifier's weight is determined by its G-mean performance on several recently arrived instances, where the G-mean of each component classifier is computed incrementally using a time decay factor. In addition, OGUEIL periodically constructs a balanced dataset from the data in the current sliding window, trains a new candidate classifier on it, and adds the candidate to the ensemble when specific conditions are met. Experimental results on both real-world and synthetic datasets show that the overall performance of the proposed method is better than that of the baseline algorithms. © 2022, Science Press. All rights reserved.
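The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update formula, so the exponential-decay recall update, the decay value of 0.9, and the names `DecayedGMean`, `ensemble_weights`, and `weighted_vote` are all assumptions made for illustration. The sketch tracks a time-decayed recall per class for each component classifier, takes the G-mean as the square root of the product of the two recalls, and uses the normalized G-means as voting weights.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class DecayedGMean:
    """Time-decayed G-mean tracker for one binary component classifier.

    Hypothetical helper: the update rule below is an assumed
    exponential decay, not the formula from the paper.
    """
    decay: float = 0.9       # assumed time decay factor
    recall_pos: float = 0.0  # decayed recall on class 1 (minority)
    recall_neg: float = 0.0  # decayed recall on class 0 (majority)
    seen_pos: bool = False
    seen_neg: bool = False

    def update(self, y_true: int, y_pred: int) -> None:
        """Fold one labeled instance into the recall of its true class."""
        hit = 1.0 if y_true == y_pred else 0.0
        if y_true == 1:
            if self.seen_pos:
                self.recall_pos = (self.decay * self.recall_pos
                                   + (1.0 - self.decay) * hit)
            else:
                self.recall_pos, self.seen_pos = hit, True
        else:
            if self.seen_neg:
                self.recall_neg = (self.decay * self.recall_neg
                                   + (1.0 - self.decay) * hit)
            else:
                self.recall_neg, self.seen_neg = hit, True

    def gmean(self) -> float:
        """Geometric mean of the two decayed per-class recalls."""
        return math.sqrt(self.recall_pos * self.recall_neg)


def ensemble_weights(trackers: List[DecayedGMean]) -> List[float]:
    """Normalize G-means into weights; fall back to uniform if all are 0."""
    scores = [t.gmean() for t in trackers]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(trackers)] * len(trackers)
    return [s / total for s in scores]


def weighted_vote(preds: List[int], weights: List[float]) -> int:
    """Predict class 1 when it holds at least half of the total weight."""
    pos = sum(w for p, w in zip(preds, weights) if p == 1)
    return 1 if pos >= 0.5 * sum(weights) else 0
```

In this sketch a classifier that always predicts the majority class ends up with a G-mean of 0 and therefore no voting weight, which is the reason G-mean is preferred over plain accuracy on imbalanced streams. Oversampling, the adaptive sliding window, and the admission rule for candidate classifiers are left out because the abstract does not specify them.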
Pages: 2844-2857
Number of pages: 13