Online semi-supervised active learning ensemble classification for evolving imbalanced data streams

Cited by: 0
Authors
Guo, Yinan [1, 4]
Pu, Jiayang [1, 3]
Jiao, Botao [2]
Peng, Yanyan [1]
Wang, Dini [1]
Yang, Shengxiang [1, 5]
Affiliations
[1] China Univ Min & Technol Beijing, Beijing 100083, Peoples R China
[2] China Univ Min & Technol, Xuzhou 221116, Peoples R China
[3] China Univ Min & Technol Beijing, Inner Mongolia Res Inst, Ordos 017010, Peoples R China
[4] Minist Educ China, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China
[5] De Montfort Univ, Leicester LE1 9BH, England
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Semi-supervised; Active learning; Concept drift; Imbalance; Data stream; DYNAMIC WEIGHTED MAJORITY; FAULT-DIAGNOSIS; MACHINERY;
DOI
10.1016/j.asoc.2024.111452
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Concept drift is a core challenge in data stream classification. Although many drift adaptation methods have been proposed, most assume that labels are available for all data, which is impractical in many real-world applications. Additionally, the absence of labels makes it difficult to obtain the imbalance ratio of an imbalanced data stream in time, which gives inaccurate guidance for resampling and leads to poor generalization. To tackle these joint challenges, an online semi-supervised active learning method is proposed to classify imbalanced data streams with concept drift. A newly arrived instance is first added to the sliding window and then assigned a pseudo label according to its nearest cluster. Meanwhile, a semi-supervised clustering algorithm offers its predicted label. Based on these two predicted labels, a cluster-based query strategy provides the criteria for evaluating and selecting representative instances. More specifically, the uncertainty and importance of each instance are defined to jointly evaluate its representativeness. After the true labels of the selected representative instances are obtained, the ensemble classifier is updated with all instances in the current sliding window. Experimental results on 13 synthetic and real data streams indicate that the proposed method outperforms six comparison methods on both G-mean and Recall under various labeling budgets.
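Two ingredients named in the abstract can be illustrated with a minimal sketch: assigning a pseudo label from the nearest cluster as instances enter a sliding window, and scoring predictions with G-mean (the geometric mean of per-class recall). This is not the authors' implementation. The function names, the fixed centroids, the window size, and the toy stream are all assumptions made for illustration, and the paper's query strategy and ensemble update are not reproduced here.

```python
import math
from collections import deque


def nearest_cluster_label(x, centroids):
    """Return the label of the centroid closest to x (Euclidean distance).

    `centroids` is a hypothetical dict mapping class label -> centroid tuple.
    """
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy setup (assumed values): two fixed cluster centroids and a window of size 3.
centroids = {0: (0.0, 0.0), 1: (5.0, 5.0)}
window = deque(maxlen=3)  # oldest instance is dropped automatically
stream = [(0.2, 0.1), (4.8, 5.3), (2.4, 2.4), (5.1, 4.9)]

pseudo_labels = []
for x in stream:
    window.append(x)  # add the newly arrived instance to the sliding window
    pseudo_labels.append(nearest_cluster_label(x, centroids))

# G-mean on a small imbalanced-style example: recall is 0.5 for class 0
# and 1.0 for class 1, so the geometric mean is sqrt(0.5).
score = g_mean([0, 0, 1, 1], [0, 1, 1, 1])
```

G-mean is a common choice for imbalanced streams because a classifier that ignores the minority class gets a recall of zero on it, which drives the whole score to zero.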
Pages: 12
Related papers
50 in total
  • [11] Reinforcement Online Active Learning Ensemble for Drifting Imbalanced Data Streams
    Zhang, Hang
    Liu, Weike
    Liu, Qingbao
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (08) : 3971 - 3983
  • [12] Active semi-supervised learning for biological data classification
    Camargo, Guilherme
    Bugatti, Pedro H.
    Saito, Priscila T. M.
    [J]. PLOS ONE, 2020, 15 (08):
  • [13] Hyperspectral Image Classification with Imbalanced Data Based on Semi-Supervised Learning
    Zheng, Xiaorou
    Jia, Jianxin
    Chen, Jinsong
    Guo, Shanxin
    Sun, Luyi
    Zhou, Chan
    Wang, Yawei
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (08):
  • [14] Online Semi-supervised Learning from Evolving Data Streams with Meta-features and Deep Reinforcement Learning
    Vafaie, Parsa
    Viktor, Herna
    Paquet, Eric
    Michalowski, Wojtek
    [J]. MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE (LOD 2021), PT II, 2022, 13164 : 70 - 85
  • [15] Online Semi-Supervised Classification on Multilabel Evolving High-Dimensional Text Streams
    Kumar, Jay
    Shao, Junming
    Kumar, Rajesh
    Din, Salah Ud
    Mawuli, Cobbinah B.
    Yang, Qinli
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (10) : 5983 - 5995
  • [16] Semi-Supervised Classification of Data Streams by BIRCH Ensemble and Local Structure Mapping
    Yi-Min Wen
    Shuai Liu
    [J]. Journal of Computer Science and Technology, 2020, 35 : 295 - 304
  • [17] Semi-Supervised Classification of Data Streams by BIRCH Ensemble and Local Structure Mapping
    Wen, Yi-Min
    Liu, Shuai
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (02) : 295 - 304
  • [18] Semi-supervised Online Learning for Efficient Classification of Objects in 3D Data Streams
    Tao, Ye
    Triebel, Rudolph
    Cremers, Daniel
    [J]. 2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015, : 2904 - 2910
  • [19] A Semi-supervised Ensemble Approach for Mining Data Streams
    Liu, Jing
    Xu, Guo-Sheng
    Xiao, Da
    Gu, Li-Ze
    Niu, Xin-Xin
    [J]. JOURNAL OF COMPUTERS, 2013, 8 (11) : 2873 - 2879
  • [20] Imbalanced fault diagnosis based on semi-supervised ensemble learning
    Chuanxia Jian
    Yinhui Ao
    [J]. Journal of Intelligent Manufacturing, 2023, 34 : 3143 - 3158