An effective cost-sensitive sparse online learning framework for imbalanced streaming data classification and its application to online anomaly detection

Authors
Zhong Chen
Victor Sheng
Andrea Edwards
Kun Zhang
Affiliations
[1] Xavier University of Louisiana, Department of Computer Science
[2] Texas Tech University, Department of Computer Science
Source
Knowledge and Information Systems, 2023, 65 (01)
Keywords
Online learning; Streaming data; Imbalance ratio; Cost-sensitive learning; Sparsity; Regularized dual averaging; Online anomaly detection;
DOI
Not available
Abstract
Class imbalance is one of the most challenging problems in streaming data mining because it degrades the predictive capability of online models. Most existing online learning approaches lack an effective mechanism for handling high-dimensional streaming data with skewed class distributions, which results in deteriorated model performance and limited interpretability. In this paper, we develop a cost-sensitive regularized dual averaging (CSRDA) method to tackle this problem. Our proposed method substantially extends the influential regularized dual averaging method by formulating a new convex optimization function, in which four $\ell_1$-norm regularized cost-sensitive objective functions are each directly optimized. We then theoretically analyze CSRDA's regret bounds and the bounds of its primal variables, demonstrating that CSRDA and its variants can achieve theoretical convergence in terms of balanced cost and sparsity when handling severely imbalanced and high-dimensional streaming data. To validate the proposed methods, we conduct extensive experiments on six benchmark streaming datasets with varied imbalance ratios and on three online anomaly detection tasks. The experimental results demonstrate that, compared to other baseline methods, CSRDA and its variants not only improve classification performance but also capture sparse features more effectively, and hence potentially offer better model interpretability.
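To make the idea of combining regularized dual averaging with class-dependent costs concrete, the following is a minimal sketch and not the authors' CSRDA algorithm. It assumes the standard closed-form $\ell_1$-RDA update (soft-thresholding the running average of subgradients, scaled by $\sqrt{t}/\gamma$) and pairs it with a class-weighted hinge loss as one plausible cost-sensitive objective; the abstract does not specify the four objectives. The function name `l1_rda_cost_sensitive` and the parameters `lam`, `gamma`, `cost_pos`, and `cost_neg` are hypothetical and chosen only for illustration.

```python
import math


def l1_rda_cost_sensitive(stream, dim, lam=0.01, gamma=1.0,
                          cost_pos=5.0, cost_neg=1.0):
    """Illustrative l1-RDA with a class-weighted hinge loss.

    stream yields (x, y) pairs, where x is a list of `dim` floats and
    y is +1 (minority class, assumed) or -1. All parameter names are
    assumptions for this sketch, not taken from the paper.
    """
    w = [0.0] * dim        # primal weights
    g_bar = [0.0] * dim    # running average of subgradients
    for t, (x, y) in enumerate(stream, start=1):
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        cost = cost_pos if y > 0 else cost_neg
        # Subgradient of cost * max(0, 1 - y * <w, x>) with respect to w.
        if margin < 1.0:
            g = [-cost * y * xi for xi in x]
        else:
            g = [0.0] * dim
        # Dual averaging: fold the new subgradient into the running mean.
        g_bar = [((t - 1) * gb + gi) / t for gb, gi in zip(g_bar, g)]
        # Closed-form l1-RDA step: coordinates whose average subgradient
        # is within lam are set exactly to zero, which yields sparsity.
        scale = math.sqrt(t) / gamma
        w = [0.0 if abs(gb) <= lam
             else -scale * (gb - lam * math.copysign(1.0, gb))
             for gb in g_bar]
    return w


# Toy stream: feature 0 carries the label, feature 1 is a very weak signal.
toy_stream = [([float(y), 0.001 * y], y) for y in (1, -1, 1, -1, 1, -1)]
weights = l1_rda_cost_sensitive(toy_stream, dim=2)
```

In this sketch a larger `cost_pos` makes margin violations on positive examples contribute larger subgradients, while `lam` controls how aggressively small average subgradients are truncated to exactly zero.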
Pages: 59 - 87
Number of pages: 28
Related papers
50 in total
  • [1] An effective cost-sensitive sparse online learning framework for imbalanced streaming data classification and its application to online anomaly detection
    Chen, Zhong
    Sheng, Victor
    Edwards, Andrea
    Zhang, Kun
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2023, 65 (01) : 59 - 87
  • [2] Cost-sensitive sparse group online learning for imbalanced data streams
    Chen, Zhong
    Sheng, Victor
    Edwards, Andrea
    Zhang, Kun
    [J]. MACHINE LEARNING, 2024, 113 (07) : 4407 - 4444
  • [3] A Framework of Online Learning with Imbalanced Streaming Data
    Yan, Yan
    Yang, Tianbao
    Yang, Yi
    Chen, Jianhui
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2817 - 2823
  • [4] Cost-Sensitive Online Adaptive Kernel Learning for Large-Scale Imbalanced Classification
    Chen, Yingying
    Hong, Zijie
    Yang, Xiaowei
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (10) : 10554 - 10568
  • [5] Cost-Sensitive Online Classification
    Wang, Jialei
    Zhao, Peilin
    Hoi, Steven C. H.
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (10) : 2425 - 2438
  • [6] Cost-Sensitive Online Classification
    Wang, Jialei
    Zhao, Peilin
    Hoi, Steven C. H.
    [J]. 12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012), 2012, : 1140 - 1145
  • [7] Cost-Sensitive Broad Learning System for Imbalanced Classification and Its Medical Application
    Yao, Liang
    Wong, Pak Kin
    Zhao, Baoliang
    Wang, Ziwen
    Lei, Long
    Wang, Xiaozheng
    Hu, Ying
    [J]. MATHEMATICS, 2022, 10 (05)
  • [8] Cost-Sensitive Online Active Learning with Application to Malicious URL Detection
    Zhao, Peilin
    Hoi, Steven C. H.
    [J]. 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 919 - 927
  • [9] Adaptive Cost-Sensitive Online Classification
    Zhao, Peilin
    Zhang, Yifan
    Wu, Min
    Hoi, Steven C. H.
    Tan, Mingkui
    Huang, Junzhou
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (02) : 214 - 228
  • [10] Cost-Sensitive Learning for Anomaly Detection in Imbalanced ECG Data Using Convolutional Neural Networks
    Zubair, Muhammad
    Yoon, Changwoo
    [J]. SENSORS, 2022, 22 (11)