CARBO: Clustering and rotation based oversampling for class imbalance learning

Cited by: 0
Authors
Paul, Mahit Kumar [1]
Pal, Biprodip [1]
Sattar, A. H. M. Sarowar [1]
Siddique, A. S. M. Mustakim Rahman [1]
Hasan, Md. Al Mehedi [1]
Affiliations
[1] Rajshahi Univ Engn & Technol, Dept Comp Sci & Engn, Rajshahi 6204, Bangladesh
Keywords
Class imbalance; Imbalanced data; Oversampling; Undersampling; CLASSIFICATION; SMOTE;
DOI
10.1016/j.knosys.2024.112196
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class imbalance is a crucial problem in machine learning in which one class significantly outnumbers the others. On such a data set, classification is difficult for standard classification algorithms, which become biased towards the majority class. Several methods have been developed to deal with class imbalance, such as oversampling, undersampling, and cost-sensitive learning. Among these, oversampling suffers neither from information loss nor from the critical cost selection challenge. However, generating appropriate synthetic samples can be challenging, and the generated samples can be vulnerable to privacy leakage. This research proposes an oversampling technique, called CARBO, that uses threshold-based geometric rotation and majority-class-influenced clustering. Unlike existing resampling approaches to the class imbalance problem, CARBO considers data privacy and optimal sample generation together for effective oversampling. The performance of CARBO is evaluated on 44 benchmark imbalanced data sets. The empirical analysis shows that CARBO makes boosting-based C4.5 ensemble classifiers perform better than six state-of-the-art approaches on 73% of the data sets. In addition, a theoretical compatibility analysis of CARBO demonstrates its robustness.
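To make the idea of rotation-based oversampling concrete, the following is a minimal sketch and not the published CARBO procedure: the abstract does not specify the rotation threshold or the majority-class-influenced clustering, so neither is modelled here. The function name `rotate_oversample` and the parameter `max_angle_deg` are hypothetical. The sketch assumes that a synthetic sample is obtained by rotating a randomly chosen minority sample about the minority centroid by a small random angle in a randomly chosen two-feature plane.

```python
import numpy as np

def rotate_oversample(X_min, n_new, max_angle_deg=10.0, seed=0):
    """Illustrative only (NOT the CARBO algorithm): build n_new synthetic
    points by rotating random minority samples about the minority centroid
    in a random 2-D coordinate plane. max_angle_deg is a hypothetical bound."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    if d < 2:
        raise ValueError("need at least two features to rotate")
    center = X_min.mean(axis=0)
    out = np.empty((n_new, d))
    for k in range(n_new):
        # Pick one minority sample and express it relative to the centroid.
        x = X_min[rng.integers(n)] - center
        # Pick two distinct feature axes that span the rotation plane.
        i, j = rng.choice(d, size=2, replace=False)
        theta = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
        c, s = np.cos(theta), np.sin(theta)
        y = x.copy()
        # Plane rotation: only coordinates i and j change.
        y[i] = c * x[i] - s * x[j]
        y[j] = s * x[i] + c * x[j]
        out[k] = y + center
    return out

X_minority = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [1.0, 1.0, 1.0]])
synthetic = rotate_oversample(X_minority, n_new=5)
```

Because a rotation preserves distances, each synthetic point lies at the same distance from the centroid as the sample it was derived from, while no synthetic point is required to coincide with an original record.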
Pages: 17
Related papers
50 in total
  • [1] Clustering-Based Oversampling Algorithm for Multi-class Imbalance Learning
    Zhao, Haixia
    Wu, Jian
    [J]. JOURNAL OF CLASSIFICATION, 2024
  • [2] Stop Oversampling for Class Imbalance Learning: A Review
    Tarawneh, Ahmad S.
    Hassanat, Ahmad B.
    Altarawneh, Ghada Awad
    Almuhaimeed, Abdullah
    [J]. IEEE ACCESS, 2022, 10 : 47643 - 47660
  • [3] A novel oversampling technique based on the manifold distance for class imbalance learning
    Guo, Yinan
    Jiao, Botao
    Yang, Lingkai
    Cheng, Jian
    Yang, Shengxiang
    Tang, Fengzhen
    [J]. INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2021, 18 (03) : 131 - 142
  • [4] A Bag Oversampling Approach for Class Imbalance in Multiple Instance Learning
    Mera, Carlos
    Arrieta, Jose
    Orozco-Alzate, Mauricio
    Branch, John
    [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2015, 2015, 9423 : 724 - 731
  • [5] Cluster-based oversampling with area extraction from representative points for class imbalance learning
    Farou, Zakarya
    Wang, Yizhi
    Horvath, Tomas
    [J]. INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 22
  • [6] A Boosting based Adaptive Oversampling Technique for Treatment of Class Imbalance
    Devi, Debashree
    Biswas, Saroj K.
    Purkayastha, Biswajit
    [J]. 2019 INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND INFORMATICS (ICCCI - 2019), 2019
  • [7] A clustering based ensemble of weighted kernelized extreme learning machine for class imbalance learning
    Choudhary, Roshani
    Shukla, Sanyam
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 164
  • [8] Correlation-based Oversampling aided Cost Sensitive Ensemble learning technique for Treatment of Class Imbalance
    Devi, Debashree
    Biswas, Saroj K.
    Purkayastha, Biswajit
    [J]. JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2022, 34 (01) : 143 - 174
  • [9] A density-based oversampling approach for class imbalance and data overlap
    Zhang, Ruizhi
    Lu, Shaowu
    Yan, Baokang
    Yu, Puliang
    Tang, Xiaoqi
    [J]. COMPUTERS & INDUSTRIAL ENGINEERING, 2023, 186
  • [10] Extreme Anomalous Oversampling Technique for Class Imbalance
    Chiamanusorn, Chittima
    Sinapiromsaran, Krung
    [J]. PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY (ICIT 2017), 2017 : 341 - 345