CARBO: Clustering and rotation based oversampling for class imbalance learning

Cited by: 0
Authors
Paul, Mahit Kumar [1]
Pal, Biprodip [1]
Sattar, A. H. M. Sarowar [1]
Siddique, A. S. M. Mustakim Rahman [1]
Hasan, Md. Al Mehedi [1]
Affiliations
[1] Rajshahi University of Engineering & Technology, Department of Computer Science & Engineering, Rajshahi 6204, Bangladesh
Keywords
Class imbalance; Imbalanced data; Oversampling; Undersampling; Classification; SMOTE
DOI
10.1016/j.knosys.2024.112196
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Class imbalance, where one class significantly outnumbers the others, is a crucial problem in machine learning. On such data sets, standard classification algorithms tend to be biased towards the majority class. Several families of methods have been developed to deal with class imbalance, including oversampling, undersampling, and cost-sensitive learning. Among these, oversampling does not suffer from information loss or from the difficulty of selecting critical costs. However, generating appropriate synthetic samples can be challenging and is vulnerable to privacy leakage. This research proposes an oversampling technique, called CARBO, that uses threshold-based geometric rotation and majority-class-influenced clustering. Unlike existing resampling approaches to the class imbalance problem, it considers data privacy and optimal sample generation together for effective oversampling. The performance of CARBO is evaluated on 44 benchmark imbalanced data sets. The empirical analysis shows that, on 73% of the data sets, CARBO makes boosting-based C4.5 ensemble classifiers perform better than six state-of-the-art approaches do. In addition, a theoretical compatibility analysis of CARBO demonstrates its robustness.
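The abstract names geometric rotation as a way to generate synthetic minority samples but does not give the procedure. The sketch below is only a generic illustration of that idea, not the CARBO algorithm: it leaves out the threshold rule and the majority-class-influenced clustering, and the function name `rotate_oversample` and the parameters `n_new`, `max_angle_deg`, and `seed` are hypothetical. It assumes numeric features and rotates randomly chosen minority points around the minority centroid by a small angle in a random pair of feature axes.

```python
import numpy as np


def rotate_oversample(X_min, n_new, max_angle_deg=10.0, seed=0):
    """Illustrative rotation-based oversampling (NOT the CARBO method).

    Each synthetic point is an existing minority point rotated around the
    minority centroid by a small random angle in a random 2-D feature plane.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n, d = X_min.shape
    if d < 2:
        raise ValueError("need at least two features to rotate")

    centroid = X_min.mean(axis=0)
    synthetic = np.empty((n_new, d))
    for k in range(n_new):
        # Pick a minority sample and centre it on the centroid (new array).
        x = X_min[rng.integers(n)] - centroid
        # Choose the two feature axes that span the rotation plane.
        i, j = rng.choice(d, size=2, replace=False)
        theta = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
        c, s = np.cos(theta), np.sin(theta)
        xi, xj = x[i], x[j]
        # Plane (Givens) rotation; all other coordinates stay unchanged.
        x[i] = c * xi - s * xj
        x[j] = s * xi + c * xj
        synthetic[k] = x + centroid
    return synthetic


if __name__ == "__main__":
    # Toy minority class with three features (made-up values).
    X_minority = np.array([[1.0, 2.0, 0.5],
                           [1.5, 1.8, 0.7],
                           [0.9, 2.2, 0.4],
                           [1.2, 2.1, 0.6]])
    print(rotate_oversample(X_minority, n_new=3))
```

Because a rotation preserves distances, every synthetic point lies exactly as far from the centroid as the original point it was derived from, while its coordinates differ from those of any original record.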
Pages: 17
Related papers
50 in total
  • [41] Evolutionary Cluster-Based Synthetic Oversampling Ensemble (ECO-Ensemble) for Imbalance Learning
    Lim, Pin
    Goh, Chi Keong
    Tan, Kay Chen
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (09) : 2850 - 2861
  • [42] Learning Minority Class prior to Minority Oversampling
    Sadhukhan, Payel
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [43] A Comprehensive Analysis of Synthetic Minority Oversampling Technique (SMOTE) for handling class imbalance
    Elreedy, Dina
    Atiya, Amir F.
    [J]. INFORMATION SCIENCES, 2019, 505 : 32 - 64
  • [44] NanBDOS: Adaptive and parameter-free borderline oversampling via natural neighbor search for class-imbalance learning
    Leng, Qiangkui
    Guo, Jiamei
    Jiao, Erjie
    Meng, Xiangfu
    Wang, Changzhong
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 274
  • [45] Adaptive Centre-Weighted Oversampling for Class Imbalance in Software Defect Prediction
    Zhao, Qi
    Yan, Xuefeng
    Zhou, Yong
    [J]. 2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 223 - 230
  • [46] Novel fuzzy clustering-based undersampling framework for class imbalance problem
    Vibha Pratap
    Amit Prakash Singh
    [J]. International Journal of System Assurance Engineering and Management, 2023, 14 : 967 - 976
  • [47] Novel fuzzy clustering-based undersampling framework for class imbalance problem
    Pratap, Vibha
    Singh, Amit Prakash
    [J]. INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2023, 14 (03) : 967 - 976
  • [48] Imputation-Based Ensemble Techniques for Class Imbalance Learning
    Razavi-Far, Roozbeh
    Farajzadeh-Zanajni, Maryam
    Wang, Boyu
    Saif, Mehrdad
    Chakrabarti, Shiladitya
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (05) : 1988 - 2001
  • [49] Prediction of rhinitis with class imbalance based on heterogeneous ensemble learning
    Yang, Jingdong
    Jiang, Biao
    Qiu, Zehao
    Meng, Yifei
    Zhang, Xiaolin
    Yu, Shaoqing
    Dai, Fu
    Qian, Yue
    [J]. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING, 2024,
  • [50] MAHAKIL: Diversity based Oversampling Approach to Alleviate the Class Imbalance Issue in Software Defect Prediction Extended Abstract
    Bennin, Kwabena E.
    Keung, Jacky
    Phannachitta, Passakorn
    Monden, Akito
    Mensah, Solomon
    [J]. PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2018, : 699 - 699