Online reliable semi-supervised learning on evolving data streams

Cited by: 40
Authors
Din, Salah Ud [1]
Shao, Junming [1]
Kumar, Jay [1]
Ali, Waqar [2]
Liu, Jiaming [1]
Ye, Yu [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Data Min Lab, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Ctr Future Media, Sch Comp Sci & Engn, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semi-supervised learning; Micro-clusters; Concept drift; Evolving data streams; CONCEPT DRIFTING DATA; NONSTATIONARY DATA; CLASSIFICATION; ENSEMBLE;
DOI
10.1016/j.ins.2020.03.052
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In today's digital era, a massive amount of streaming data is automatically and continuously generated. To learn from such data streams, many algorithms have been proposed during the last decade. Due to the dynamic nature of streaming data, learning algorithms must adapt to concept drift and work under limited memory and time. Currently, most existing works assume that the true class labels of all incoming instances are immediately available. In real-world applications, however, labeling every data item in a stream is time- and resource-consuming. A more realistic situation is that only a few instances in a data stream are labeled. Therefore, designing an efficient and effective learning algorithm that can handle concept drift and label scarcity while working under limited resources is of significant importance. In this paper, we propose a new online semi-supervised learning algorithm that models concept drift with a set of micro-clusters. These micro-clusters are dynamically maintained to capture the evolving concepts with error-based representative learning. In this way, local concept drifts are captured more quickly, which supports effective data stream learning. Extensive experiments on several data sets demonstrate that our learning model achieves high classification performance compared with many state-of-the-art algorithms. (C) 2020 Elsevier Inc. All rights reserved.
Pages: 153-171
Number of pages: 19
Related papers
50 in total
  • [1] Semi-supervised federated learning on evolving data streams
    Mawuli, Cobbinah B.
    Kumar, Jay
    Nanor, Ebenezer
    Fu, Shangxuan
    Pan, Liangxu
    Yang, Qinli
    Zhang, Wei
    Shao, Junming
    [J]. INFORMATION SCIENCES, 2023, 643
  • [2] Online semi-supervised active learning ensemble classification for evolving imbalanced data streams
    Guo, Yinan
    Pu, Jiayang
    Jiao, Botao
    Peng, Yanyan
    Wang, Dini
    Yang, Shengxiang
    [J]. APPLIED SOFT COMPUTING, 2024, 155
  • [3] RELIABLE SEMI-SUPERVISED LEARNING ON IMBALANCED EVOLVING DATA STREAM
    Pan Liangxu
    [J]. 2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022
  • [4] Semi-supervised Learning Algorithm for Online Electricity Data Streams
    Patil, Pramod
    Fatangare, Yogita
    Kulkarni, Parag
    [J]. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY ALGORITHMS IN ENGINEERING SYSTEMS, VOL 1, 2015, 324 : 349 - 358
  • [5] Online Semi-supervised Learning from Evolving Data Streams with Meta-features and Deep Reinforcement Learning
    Vafaie, Parsa
    Viktor, Herna
    Paquet, Eric
    Michalowski, Wojtek
    [J]. MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE (LOD 2021), PT II, 2022, 13164 : 70 - 85
  • [6] A novel semi-supervised classification approach for evolving data streams
    Liao, Guobo
    Zhang, Peng
    Yin, Hongpeng
    Deng, Xuanhong
    Li, Yanxia
    Zhou, Han
    Zhao, Dandan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 215
  • [7] Reliable Semi-supervised Learning
    Shao, Junming
    Huang, Chen
    Yang, Qinli
    Luo, Guangchun
    [J]. 2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2016 : 1197 - 1202
  • [8] Semi-Supervised Evolving Approach for Data Streams Classification Based on Online Gustafson-Kessel Algorithm
    Gorbunov, I. V.
    Kalmykov, M. O.
    Rasskazov, E. V.
    Yankovskaya, A. E.
    [J]. 2017 11TH IEEE INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT 2017), 2017 : 206 - 209
  • [9] OSSEFS: An online semi-supervised ensemble fuzzy system for data streams learning with missing values
    Yan, Lu
    Zhao, Tao
    Xie, Xiangpeng
    Precup, Radu-Emil
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [10] Online Semi-supervised Learning for Multi-target Regression in Data Streams Using AMRules
    Sousa, Ricardo
    Gama, Joao
    [J]. ADVANCES IN INTELLIGENT DATA ANALYSIS XV, 2016, 9897 : 123 - 133