Cost-sensitive learning for imbalanced data streams

Cited by: 23
Authors
Loezer, Lucas [1]
Enembreck, Fabricio [1]
Barddal, Jean Paul [1]
Britto Jr, Alceu de Souza [1]
Affiliation
[1] Pontificia Univ Catolica Parana, Grad Program Informat PPGIa, Curitiba, Parana, Brazil
Keywords
cost-sensitive; ensemble; data stream; imbalanced datasets; adaptive random forest; CLASSIFICATION;
DOI
10.1145/3341105.3373949
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The data imbalance problem hampers the classification task. In streaming environments, the problem is harder still, because the proportion of classes can vary over time. Approaches based on misclassification costs can be used to mitigate it. In this paper, we present the Cost-sensitive Adaptive Random Forest (CSARF) and compare it to the Adaptive Random Forest (ARF) and ARF with Resampling (ARF(RE)) on six real-world and six synthetic data sets with different class ratios. The empirical study analyzes two misclassification cost strategies for CSARF and shows that CSARF achieved statistically superior average recall and average F1 compared to ARF.
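The general idea of using misclassification costs on an imbalanced stream can be illustrated with a minimal sketch. It is not the CSARF algorithm: the inverse-frequency cost rule, the function names, and the toy stream below are all assumptions made for illustration, and the record above does not say which cost strategies the paper uses.

```python
from collections import Counter


def inverse_frequency_costs(class_counts):
    """Hypothetical cost rule: misclassifying class c costs total / count(c)."""
    total = sum(class_counts.values())
    return {c: total / n for c, n in class_counts.items() if n > 0}


def min_expected_cost_class(probabilities, costs):
    """Return the class whose prediction minimises the expected cost.

    Expected cost of predicting k = sum over c != k of P(c) * cost(c).
    """
    def expected_cost(k):
        return sum(p * costs.get(c, 1.0)
                   for c, p in probabilities.items() if c != k)

    return min(probabilities, key=expected_cost)


# Toy stream (assumed): class counts are updated one label at a time,
# so the costs follow the class ratio as it drifts.
counts = Counter()
for label in ["neg"] * 95 + ["pos"] * 5:
    counts[label] += 1

costs = inverse_frequency_costs(counts)

# A base learner that favours the majority class...
probs = {"neg": 0.8, "pos": 0.2}
# ...is overruled once the cost of missing the minority class is weighed in.
prediction = min_expected_cost_class(probs, costs)
print(costs, prediction)
```

Because the counts are updated incrementally, the costs can be recomputed at any point in the stream, which is one simple way to react to a changing class ratio.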
Pages: 498-504
Number of pages: 7