Online active learning method for multi-class imbalanced data stream

Authors
Ang Li
Meng Han
Dongliang Mu
Zhihui Gao
Shujuan Liu
Affiliation
[1] North Minzu University, School of Computer Science and Engineering
Source
Knowledge and Information Systems, 2024, 66 (04)
Keywords
Data stream classification; Active learning; Ensemble learning; Multi-class imbalance; Concept drift
DOI
Not available
Abstract
Data stream classification is an important research direction in data mining. However, multi-class imbalance, concept drift, and a variable class imbalance ratio can greatly degrade the performance of classification models, and the high cost of sample labeling remains a persistent research concern. To address these problems, an online active learning method for multi-class imbalanced data streams (OALM-MI) is proposed. First, a comprehensive sample weighting method based on cross-entropy and margin values weights each incoming sample according to its classification difficulty and importance, with the aim of enhancing the classifier's ability to learn from important samples. Second, a comprehensive weighting and updating strategy for the ensemble classifiers is introduced; it combines mean square error, improved square error, recall, and each classifier's weight in the previous sliding window of samples. Third, an adaptive window is used to detect and handle concept drift, enabling better adaptation to changes in the data stream during learning. Finally, a margin matrix label request strategy based on the class imbalance ratio is proposed, which requests labels for samples according to their imbalance ratio and classification difficulty and thereby provides more learning opportunities for minority class samples and important samples. Comprehensive experiments on 12 synthetic and six real data streams, comparing against seven state-of-the-art algorithms, showed that OALM-MI achieved the highest performance in terms of recall, precision, F1-score, Kappa, and G-mean.
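The abstract names the ingredients of the sample weighting and label request steps (cross-entropy, margin values, class imbalance ratio) but not their formulas. The following is a minimal Python sketch of how such quantities could be combined; it is not the OALM-MI algorithm itself. The function names, the multiplicative weight formula, the imbalance-ratio definition, and the `base_threshold` parameter are all assumptions made for illustration.

```python
import math

def cross_entropy(probs, true_label):
    """Cross-entropy of one prediction: -log of the probability given to the true class."""
    return -math.log(max(probs[true_label], 1e-12))

def margin(probs):
    """Gap between the two largest predicted class probabilities."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def sample_weight(probs, true_label):
    """Hypothetical weight: grows with the loss and shrinks as the margin widens,
    so hard-to-classify samples are emphasized. Not the paper's formula."""
    return 1.0 + cross_entropy(probs, true_label) * (1.0 - margin(probs))

def should_request_label(probs, class_counts, base_threshold=0.4):
    """Hypothetical label request rule: query when the margin falls below a
    threshold that is relaxed for classes that have been rare so far."""
    predicted = max(range(len(probs)), key=probs.__getitem__)
    total = sum(class_counts)
    # Share of the predicted class among labeled samples (uniform if none seen yet).
    share = class_counts[predicted] / total if total else 1.0 / len(probs)
    # Assumed imbalance ratio: 1.0 for a balanced stream, below 1.0 for minority classes.
    ratio = share * len(probs)
    threshold = min(base_threshold / max(ratio, 1e-6), 1.0)
    return margin(probs) < threshold
```

Under these assumptions, an uncertain prediction such as `[0.4, 0.35, 0.25]` receives a larger weight than a confident one such as `[0.9, 0.05, 0.05]`, and a sample predicted as a rarely seen class is queried even when its margin is moderately large.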
Pages: 2355 - 2391
Number of pages: 36
Related papers
50 in total
  • [1] Online active learning method for multi-class imbalanced data stream
    Li, Ang
    Han, Meng
    Mu, Dongliang
    Gao, Zhihui
    Liu, Shujuan
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (04) : 2355 - 2391
  • [2] An online ensemble classification algorithm for multi-class imbalanced data stream
    Han, Meng
    Li, Chunpeng
    Meng, Fanxing
    He, Feifei
    Zhang, Ruihua
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (11) : 6845 - 6880
  • [3] A Combination Method for Multi-Class Imbalanced Data Classification
    Li, Hu
    Zou, Peng
    Han, Weihong
    Xia, Rongze
    [J]. 2013 10TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA 2013), 2013, : 365 - 368
  • [4] Learning Imbalanced Multi-class Data with Optimal Dichotomy Weights
    Liu, Xu-Ying
    Li, Qian-Qian
    Zhou, Zhi-Hua
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 478 - 487
  • [5] Multi-class Boosting for Imbalanced Data
    Fernandez-Baldera, Antonio
    Buenaposada, Jose M.
    Baumela, Luis
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2015), 2015, 9117 : 57 - 64
  • [6] Learning from Combination of Data Chunks for Multi-class Imbalanced Data
    Liu, Xu-Ying
    Li, Qian-Qian
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1680 - 1687
  • [7] Multi-class Ensemble Learning of Imbalanced Bidding Fraud Data
    Anowar, Farzana
    Sadaoui, Samira
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11489 : 352 - 358
  • [8] Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data
    Zhao, Jiakun
    Jin, Ju
    Zhang, Yibo
    Zhang, Ruifeng
    Chen, Si
    [J]. INTELLIGENT DATA ANALYSIS, 2022, 26 (03) : 599 - 614
  • [9] Online Multi-threshold Learning with Imbalanced Data Stream
    Cai, Xufen
    Yang, Min
    Zhu, Rong
    Li, Xiaoyan
    Ye, Long
    Zhang, Qin
    [J]. ADVANCES IN NEURAL NETWORKS, PT I, 2017, 10261 : 3 - 9
  • [10] Hybrid Sampling and Dynamic Weighting-Based Classification Method for Multi-Class Imbalanced Data Stream
    Han, Meng
    Li, Ang
    Gao, Zhihui
    Mu, Dongliang
    Liu, Shujuan
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (10):