ADAPTIVE DATA REUSE FOR CLASSIFYING IMBALANCED AND CONCEPT-DRIFTING DATA STREAMS

Times cited: 0
Authors
Nguyen, Hien M. [1]
Cooper, Eric W. [2]
Kamei, Katsuari [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, Kusatsu, Shiga 5258577, Japan
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, Kusatsu, Shiga 5258577, Japan
Keywords
Adaptive data reuse; Data selection; Class imbalance; Concept drift; Data stream; Ensemble learning;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Mining data streams has recently been the subject of extensive research. However, most work in this field assumes that the class distribution underlying a data stream is balanced. In this paper, we propose a new method for learning from imbalanced data streams. To deal with class imbalance, we select and reuse past data to improve the representation of the minority class. Unlike previous methods, our method automatically adapts data selection to concept drift. A data stream may undergo complicated concept drift, which makes data selection more difficult. We therefore consider several candidate data selection solutions, each of which may be more appropriate under certain streaming conditions; none of them is the best at all times. We compare the candidates and identify the best one by cross-validation on the most recent training data. Experimental evaluations on simulated and real-world data streams show that our method achieves better performance than previous methods, especially when concept drift occurs.
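The selection step described in the abstract (score several candidate ways of reusing past minority-class data by cross-validation on the most recent training data, then keep the best) can be illustrated with a minimal sketch. This is not the authors' implementation. The nearest-centroid classifier, the balanced-accuracy score, the three candidate strategies (`reuse_none`, `reuse_all`, `reuse_nearest`), the interleaved fold split, and the toy data are all assumptions chosen to keep the example short and dependency-free.

```python
import math
from statistics import mean

def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    return tuple(mean(dim) for dim in zip(*points))

def fit(data):
    """Train a nearest-centroid classifier (an assumed stand-in learner)."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: math.dist(model[y], x))

def balanced_accuracy(model, data):
    """Mean per-class recall, so the minority class weighs as much as the majority."""
    hits, totals = {}, {}
    for x, y in data:
        totals[y] = totals.get(y, 0) + 1
        hits[y] = hits.get(y, 0) + (predict(model, x) == y)
    return mean(hits[y] / totals[y] for y in totals)

# Hypothetical candidate strategies: each returns the past minority
# examples (label 1) to add to the current training data.
def reuse_none(past_minority, train):
    return []

def reuse_all(past_minority, train):
    return list(past_minority)

def reuse_nearest(past_minority, train, k=2):
    """Reuse the k past minority examples closest to the current minority centroid."""
    current = [x for x, y in train if y == 1]
    if not current:
        return list(past_minority)
    c = centroid(current)
    return sorted(past_minority, key=lambda ex: math.dist(ex[0], c))[:k]

def cv_score(strategy, past_minority, recent, folds=2):
    """Cross-validate one strategy on the most recent chunk (interleaved folds)."""
    scores = []
    for f in range(folds):
        held_out = recent[f::folds]
        train = [ex for i, ex in enumerate(recent) if i % folds != f]
        model = fit(train + strategy(past_minority, train))
        scores.append(balanced_accuracy(model, held_out))
    return mean(scores)

def select_strategy(strategies, past_minority, recent, folds=2):
    """Pick the strategy with the highest cross-validated score."""
    return max(strategies, key=lambda s: cv_score(s, past_minority, recent, folds))

# Toy data (assumed): the minority concept has drifted from around (-5, -5)
# in the past to around (3, 3) in the most recent chunk.
recent = [((3.0, 3.0), 1), ((3.5, 3.0), 1)] + [
    (p, 0) for p in [(0, 0), (1, 0), (0, 1), (1, 1),
                     (0.5, 0.5), (0, 0.5), (1, 0.5), (0.5, 0)]]
past_minority = [(p, 1) for p in [(-4, -4), (-4, -5), (-5, -4), (-5, -5),
                                  (-4.5, -4.5), (-6, -6), (3, 3.5), (3.5, 3.5)]]

best = select_strategy([reuse_none, reuse_all, reuse_nearest], past_minority, recent)
print(best.__name__)
```

On this toy stream, blindly reusing every past minority example pulls the minority centroid toward the outdated concept, so that candidate scores worse in cross-validation than selective reuse. This is the kind of situation in which adapting the selection to drift matters.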
Pages: 4995-5010
Number of pages: 16