Online semi-supervised active learning ensemble classification for evolving imbalanced data streams

Authors
Guo, Yinan [1, 4]
Pu, Jiayang [1, 3]
Jiao, Botao [2]
Peng, Yanyan [1]
Wang, Dini [1]
Yang, Shengxiang [1, 5]
Affiliations
[1] China Univ Min & Technol Beijing, Beijing 100083, Peoples R China
[2] China Univ Min & Technol, Xuzhou 221116, Peoples R China
[3] China Univ Min & Technol Beijing, Inner Mongolia Res Inst, Ordos 017010, Peoples R China
[4] Minist Educ China, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China
[5] De Montfort Univ, Leicester LE1 9BH, England
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Semi-supervised; Active learning; Concept drift; Imbalance; Data stream; DYNAMIC WEIGHTED MAJORITY; FAULT-DIAGNOSIS; MACHINERY
DOI
10.1016/j.asoc.2024.111452
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Concept drift is a core challenge in data stream classification. Although many drift adaptation methods have been proposed, most assume that labels are available for all data, which is impractical in many real-world applications. Moreover, without labels, the imbalance ratio of an imbalanced data stream is difficult to obtain in time, which gives inaccurate guidance for resampling and leads to poor generalization. To tackle these joint challenges, an online semi-supervised active learning method is proposed to classify imbalanced data streams with concept drift. A newly arrived instance is first added to the sliding window and then assigned a pseudo label according to its nearest cluster. Meanwhile, a semi-supervised clustering algorithm provides its predicted label. Based on these two predicted labels, a cluster-based query strategy provides the criteria for evaluating and selecting representative instances. More specifically, the uncertainty and importance of each instance are defined to jointly evaluate its representativeness. After the true labels of the selected typical instances are obtained, the ensemble classifier is updated with all instances in the current sliding window. Experimental results on 13 synthetic and real data streams indicate that the proposed method outperforms six comparative methods in both G-mean and Recall under various labeling budgets.
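The abstract reports results in G-mean and Recall, two metrics commonly preferred over accuracy on imbalanced data. As a minimal sketch (not the authors' evaluation code), the snippet below assumes G-mean is defined in the usual way, as the geometric mean of per-class recall; the function name `g_mean` and the toy labels are illustrative assumptions only.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed standard definition)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    hits = defaultdict(int)    # correctly predicted instances per true class
    totals = defaultdict(int)  # number of instances per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy imbalanced example (hypothetical data): 8 majority and 2 minority instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = g_mean(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, g_mean={score:.3f}")
```

In this toy case one of the two minority instances is missed: accuracy stays high because the majority class dominates, while G-mean drops because minority recall is only one half. That sensitivity is why G-mean is a natural choice for imbalanced streams.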
Pages: 12