Online Ensemble Learning of Data Streams with Gradually Evolved Classes

Cited by: 99
Authors
Sun, Yu [1]
Tang, Ke [1]
Minku, Leandro L. [2]
Wang, Shuo [3]
Yao, Xin [3]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, USTC Birmingham Joint Res Inst Intelligent Comput, Hefei 230027, Peoples R China
[2] Univ Leicester, Dept Comp Sci, Univ Rd, Leicester LE1 7RH, Leics, England
[3] Univ Birmingham, Sch Comp Sci, Ctr Excellence Res Computat Intelligence & Applic, Birmingham B15 2TT, W Midlands, England
Funding
Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China
Keywords
Data stream mining; class evolution; ensemble model; on-line learning; imbalanced classification; CONCEPT DRIFT; CLASSIFICATION; CLASSIFIERS;
DOI
10.1109/TKDE.2016.2526675
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class evolution, the phenomenon of class emergence and disappearance, is an important research topic for data stream mining. All previous studies implicitly regard class evolution as a transient change, which is not true for many real-world problems. This paper concerns the scenario where classes emerge or disappear gradually. A class-based ensemble approach, namely Class-Based ensemble for Class Evolution (CBCE), is proposed. By maintaining a base learner for each class and dynamically updating the base learners with new data, CBCE can rapidly adjust to class evolution. A novel under-sampling method for the base learners is also proposed to handle the dynamic class-imbalance problem caused by the gradual evolution of classes. Empirical studies demonstrate the effectiveness of CBCE in various class evolution scenarios in comparison to existing class evolution adaptation methods.
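The abstract's central idea, one base learner per class that is updated online with under-sampled negative examples, can be illustrated with a minimal Python sketch. This is not the published CBCE algorithm: the names `OnlineLogistic` and `ClassBasedEnsemble`, the time-decayed class-frequency estimate, and the keep probability p / (1 - p) for negatives are all assumptions chosen for illustration. The base learner has no bias term; a caller who wants one can append a constant 1.0 feature.

```python
import math
import random


class OnlineLogistic:
    """Binary logistic regression updated by SGD, one example at a time."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.lr = lr

    def score(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, target):
        err = target - self.score(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]


class ClassBasedEnsemble:
    """Hypothetical one-learner-per-class ensemble (illustration only)."""

    def __init__(self, n_features, decay=0.99, seed=0):
        self.n_features = n_features
        self.decay = decay
        self.rng = random.Random(seed)
        self.learners = {}  # class label -> OnlineLogistic
        self.prior = {}     # class label -> time-decayed frequency estimate

    def predict(self, x):
        if not self.learners:
            return None
        return max(self.learners, key=lambda c: self.learners[c].score(x))

    def learn(self, x, y):
        if y not in self.learners:  # a newly emerging class gets its own learner
            self.learners[y] = OnlineLogistic(self.n_features)
            self.prior[y] = 0.0
        # Decay all frequency estimates so they track gradual change.
        for c in self.prior:
            hit = 1.0 if c == y else 0.0
            self.prior[c] = self.decay * self.prior[c] + (1.0 - self.decay) * hit
        for c, learner in self.learners.items():
            if c == y:
                learner.update(x, 1.0)  # positives are always used
            else:
                # Assumed rule: keep negatives so that positives and negatives
                # reach learner c at roughly the same rate.
                p = self.prior[c]
                keep = min(1.0, p / (1.0 - p)) if p < 1.0 else 1.0
                if self.rng.random() < keep:
                    learner.update(x, 0.0)
```

Calling `learn(x, y)` for each labelled example and `predict(x)` for each query is the whole interface. The sketch never retires a learner, so handling a disappearing class (for example by discounting learners whose estimated frequency has decayed toward zero) is left out.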
Pages: 1532-1545
Page count: 14