ADAPTIVE DATA REUSE FOR CLASSIFYING IMBALANCED AND CONCEPT-DRIFTING DATA STREAMS

Cited by: 0
Authors
Nguyen, Hien M. [1]
Cooper, Eric W. [2]
Kamei, Katsuari [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, Kusatsu, Shiga 5258577, Japan
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan
Keywords
Adaptive data reuse; Data selection; Class imbalance; Concept drift; Data stream; Ensemble learning
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mining data streams has recently been the subject of extensive research. However, most work in this field assumes that the class distribution underlying a data stream is balanced. In this paper, we therefore propose a new method for learning from imbalanced data streams. To deal with class imbalance, we select and reuse past data to improve the representation of the minority class. Unlike previous methods, our method automatically adapts its data selection to concept drift. A data stream may undergo complicated concept drift, which makes data selection more difficult. We therefore consider several candidate data selection solutions, each of which may be more appropriate under certain streaming conditions; in other words, none of them is the best at all times. We compare the candidates and identify the best one by cross-validation on the most recent training data. Experimental evaluations on simulated and real-world data streams show that our method achieves better performance than previous methods, especially when concept drift occurs.
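The selection step described in the abstract (score several candidate data selection strategies by cross-validation on the most recent training data, then keep the best) can be illustrated with a minimal sketch. The abstract does not specify the candidate strategies, the base classifier, or the evaluation metric, so everything below is an assumption made for illustration only: the three strategies (`select_none`, `select_all`, `select_nearest`), the nearest-centroid classifier, balanced accuracy as the score, and the toy data are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: all names, strategies, and data are hypothetical.

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def centroid(points):
    return [sum(col) / len(col) for col in zip(*points)]

def fit(samples):
    """Nearest-centroid classifier (assumed base learner): one centroid per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}

def predict(model, x):
    return min(model, key=lambda y: dist2(x, model[y]))

def balanced_accuracy(model, samples):
    """Mean per-class recall over the labels present in `samples`."""
    recalls = []
    for label in {y for _, y in samples}:
        own = [x for x, y in samples if y == label]
        recalls.append(sum(predict(model, x) == label for x in own) / len(own))
    return sum(recalls) / len(recalls)

# Hypothetical candidate strategies for reusing past minority-class samples.
def select_none(past, train):
    return []

def select_all(past, train):
    return list(past)

def select_nearest(past, train, k=2):
    """Reuse the k past samples closest to the current minority centroid."""
    current = [x for x, y in train if y == 1]
    if not current:
        return list(past)[:k]
    c = centroid(current)
    return sorted(past, key=lambda s: dist2(s[0], c))[:k]

CANDIDATES = {"none": select_none, "all": select_all, "nearest": select_nearest}

def cv_score(select, past, recent, folds=3):
    """k-fold cross-validation on the most recent chunk only."""
    scores = []
    for i in range(folds):
        test = recent[i::folds]
        train = [s for j, s in enumerate(recent) if j % folds != i]
        model = fit(train + select(past, train))
        scores.append(balanced_accuracy(model, test))
    return sum(scores) / folds

def choose_best(candidates, past, recent):
    scores = {name: cv_score(sel, past, recent) for name, sel in candidates.items()}
    return max(scores, key=scores.get), scores

# Toy chunk after a drift: the minority class (label 1) now sits near x = 1,
# while the stored past minority samples come from an old concept near x = -5.
recent = [((1.0, 0.0), 1), ((1.1, 0.0), 1), ((0.9, 0.0), 1),
          ((0.0, 0.0), 0), ((0.1, 0.0), 0), ((-0.1, 0.0), 0),
          ((0.0, 0.1), 0), ((0.0, -0.1), 0), ((0.1, 0.1), 0)]
past_minority = [((-5.0, 0.0), 1), ((-5.1, 0.0), 1), ((-4.9, 0.0), 1)]

best, scores = choose_best(CANDIDATES, past_minority, recent)
print(best, scores)
```

In this toy setting the past minority samples belong to an outdated concept, so reusing them pulls the minority centroid away from the current minority region and lowers the cross-validated score; the strategy that reuses nothing is selected. With past data that still matches the current concept, a reuse strategy could score higher, which is the situation the adaptive choice is meant to handle.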
Pages: 4995-5010
Page count: 16