An Adaptive Active Learning Method for Multiclass Imbalanced Data Streams with Concept Drift

Cited by: 0
Authors
Han, Meng [1]
Li, Chunpeng [1]
Meng, Fanxing [1]
He, Feifei [1]
Zhang, Ruihua [1]
Affiliations
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 16
Keywords
data stream classification; multiclass imbalance; concept drift; ensemble learning; active learning; CLASSIFICATION
DOI
10.3390/app14167176
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Learning from multiclass imbalanced data streams with concept drift and variable class imbalance ratios under a limited label budget presents new challenges in data mining. To address these challenges, this paper proposes an adaptive active learning method for multiclass imbalanced data streams with concept drift (AdaAL-MID). First, a dynamic label budget strategy for concept drift scenarios is introduced, which allocates the label budget across different stages of the data stream to handle concept drift effectively. Second, an uncertainty-based label request strategy using a dual-margin dynamic threshold matrix is designed to increase learning opportunities for minority class instances and for instances that are difficult to classify. Combined with a random strategy, it can estimate the current class imbalance distribution while accessing only a limited number of instance labels. Finally, an instance-adaptive sampling strategy is proposed that jointly considers the imbalance ratio and the classification difficulty of instances. Combined with a weighted ensemble strategy, it improves the classification performance of the ensemble classifier on imbalanced data streams. Extensive experiments and analyses demonstrate that AdaAL-MID can handle various complex concept drifts and adapt to changes in class imbalance ratios, and that it outperforms several state-of-the-art active learning algorithms.
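The general idea of combining a margin-based uncertainty query with a random query under a label budget can be sketched as follows. This is a minimal illustration, not the AdaAL-MID algorithm: the abstract does not specify the dual-margin threshold matrix or the dynamic budget rule, so the class `LabelRequester`, its parameters (`budget`, `threshold`, `random_rate`), the fixed per-class thresholds, and the simple budget check are all assumptions made for this example.

```python
import random
from collections import Counter


def margin(probs):
    """Difference between the two largest predicted class probabilities."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


class LabelRequester:
    """Hypothetical label-request rule: random queries plus margin-based queries."""

    def __init__(self, n_classes, budget=0.2, threshold=0.3, random_rate=0.05, seed=0):
        self.budget = budget                        # max fraction of instances to label
        self.thresholds = [threshold] * n_classes   # assumed: one fixed margin threshold per class
        self.random_rate = random_rate              # probability of a purely random query
        self.rng = random.Random(seed)
        self.seen = 0
        self.queried = 0
        self.random_label_counts = Counter()        # labels obtained through random queries

    def should_query(self, probs):
        """Return 'random', 'uncertain', or None for one incoming instance."""
        self.seen += 1
        if self.queried >= self.budget * self.seen:
            return None                             # label budget currently exhausted
        if self.rng.random() < self.random_rate:
            self.queried += 1
            return "random"
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if margin(probs) < self.thresholds[predicted]:
            self.queried += 1
            return "uncertain"
        return None

    def record_random_label(self, label):
        """Store a label that was obtained through a random query."""
        self.random_label_counts[label] += 1

    def class_distribution(self):
        """Estimate class proportions from randomly queried labels only."""
        total = sum(self.random_label_counts.values())
        if total == 0:
            return {}
        return {c: n / total for c, n in self.random_label_counts.items()}
```

Only randomly queried labels feed the distribution estimate in this sketch, because labels obtained through uncertainty queries are biased toward instances near decision boundaries.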
Pages: 32