Dynamic weighted selective ensemble learning algorithm for imbalanced data streams

Cited by: 5
Authors
Yan, Zhang [1, 2]
Du Hongle [1, 2]
Gang, Ke [3]
Lin, Zhang [1, 2]
Chen, Yeh-Cheng [4]
Affiliations
[1] Shangluo Univ, Sch Math & Comp Applicat, Shangluo City, Shaanxi, Peoples R China
[2] Shangluo Publ Big Data Res Ctr, Shangluo City, Shaanxi, Peoples R China
[3] Dongguan Polytech, Dept Comp Engn, Dongguan, Guangdong, Peoples R China
[4] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Source
JOURNAL OF SUPERCOMPUTING | 2022, Vol. 78, Issue 04
Keywords
Concept drift; Imbalanced data stream; Data stream mining; Ensemble learning
DOI
10.1007/s11227-021-04084-w
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data stream mining is an active research topic in data mining. Most existing algorithms assume that a data stream with concept drift is balanced. In real-world applications, however, data streams are often imbalanced as well as subject to concept drift, which makes learning more complex. In online learning, oversampling is commonly used: a small number of samples are selected from the previous data block by some strategy and added to the current data block to enlarge its minority class. With this approach, the number of stored samples, the oversampling method, and the way base-classifier weights are calculated all affect the classification performance of the ensemble classifier. This paper proposes a dynamic weighted selective ensemble (DWSE) learning algorithm for imbalanced data streams with concept drift. On the one hand, resampling the minority samples of the previous data block enlarges the minority class of the current data block and lets information from the previous block contribute to building the classifier, which reduces the impact of concept drift. The paper defines how the information content of each sample is calculated and gives the resampling and updating methods for the minority samples. On the other hand, concept drift degrades the performance of a base classifier, and a decay factor is usually used to describe this degradation. A static decay factor, however, cannot describe the degradation accurately under concept drift. DWSE therefore defines a dynamic decay factor for each base classifier and uses the resulting attenuation to select which sub-classifiers to eliminate, so that the algorithm handles concept drift better. In comparisons with other algorithms, the results show that DWSE achieves better classification performance on both majority-class and minority-class samples.
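The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the DWSE algorithm itself: the abstract does not give the information-content formula or the dynamic decay factor, so the score attached to each stored sample, the use of per-block accuracy as the decay factor, and the names augment_block, decay_and_prune, Member and threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A sample is (features, label); label 1 marks the minority class (assumption).
Sample = Tuple[List[float], int]


def augment_block(current: List[Sample],
                  stored: List[Tuple[float, Sample]],
                  k: int) -> List[Sample]:
    """Add the k highest-scoring stored minority samples to the current block.

    `stored` pairs each retained minority sample with a hypothetical
    information-content score; the paper's actual formula is not reproduced.
    """
    ranked = sorted(stored, key=lambda pair: pair[0], reverse=True)
    return current + [sample for _, sample in ranked[:k]]


@dataclass
class Member:
    """One base classifier in the ensemble, reduced to a name and a weight."""
    name: str
    weight: float = 1.0


def decay_and_prune(members: List[Member],
                    accuracy: Dict[str, float],
                    threshold: float) -> List[Member]:
    """Decay each weight by a per-member factor and drop weak members.

    The decay factor here is simply the member's accuracy on the newest
    block (an assumption standing in for the dynamic decay factor).
    """
    kept = []
    for m in members:
        decayed = m.weight * accuracy[m.name]
        if decayed >= threshold:
            kept.append(Member(m.name, decayed))
    return kept
```

Because each member is decayed by its own factor rather than a shared constant, a classifier that still fits the current concept keeps most of its weight, while one that has drifted out of date falls below the threshold and is removed.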
Pages: 5394-5419
Page count: 26