Novel fuzzy clustering-based undersampling framework for class imbalance problem

Cited by: 2
Authors
Pratap, Vibha [1, 2]
Singh, Amit Prakash [1]
Affiliations
[1] Guru Gobind Singh Indraprastha Univ, USICT, New Delhi, India
[2] Indira Gandhi Delhi Tech Univ Women, Delhi, India
Keywords
Class imbalance; Ensemble method; Fuzzy C-mean; Machine learning; Oversampling; Under-sampling; CLASSIFICATION; PREDICTION; SMOTE
DOI
10.1007/s13198-023-01897-1
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
The class imbalance problem occurs in many real-world datasets. Although classifiers commonly assume that samples are evenly distributed across the classes of a dataset, many datasets are in fact highly imbalanced, and classifying them is a challenging task in machine learning. Researchers have developed many approaches to the class imbalance problem, such as resampling and ensemble methods. Resampling methods either increase the number of minority class samples (oversampling) or reduce the number of majority class samples (under-sampling). Ensemble methods, in contrast, classify several subsets of the data and combine the classification results to produce the final result. This study introduces a new fuzzy C-means clustering-based under-sampling method. We evaluated the proposed method on 30 small-scale imbalanced datasets. The results show that the proposed method improves classification performance: compared with the k-means under-sampling method, average sensitivity improved by 1% and average balanced accuracy improved by 3%. The results of this study should be useful for classifying imbalanced datasets in various domains.
Pages: 967-976
Number of pages: 10
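
Illustrative sketch
The abstract names fuzzy C-means clustering as the basis for under-sampling but does not say how majority samples are selected. The Python sketch below is therefore only one plausible reading, not the authors' algorithm: it clusters the majority class into as many fuzzy clusters as there are minority samples and keeps the real majority sample closest to each cluster center. The function names `fuzzy_c_means` and `fcm_undersample`, the fuzzifier `m = 2.0`, and the nearest-to-center selection rule are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fcm_undersample(X, y, majority_label, minority_label, seed=0):
    """Reduce the majority class to at most the minority class size."""
    X_maj = X[y == majority_label]
    X_min = X[y == minority_label]
    centers, _ = fuzzy_c_means(X_maj, n_clusters=len(X_min), seed=seed)
    # Assumed selection rule: keep the real majority sample nearest
    # to each center (duplicates collapse, so fewer may be kept).
    d = np.linalg.norm(X_maj[:, None, :] - centers[None, :, :], axis=2)
    keep = np.unique(d.argmin(axis=0))
    X_bal = np.vstack([X_maj[keep], X_min])
    y_bal = np.concatenate([
        np.full(len(keep), majority_label),
        np.full(len(X_min), minority_label),
    ])
    return X_bal, y_bal


# Synthetic two-class data: 100 majority samples, 10 minority samples.
rng = np.random.default_rng(42)
X_demo = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                    rng.normal(3.0, 0.5, (10, 2))])
y_demo = np.array([0] * 100 + [1] * 10)
X_bal, y_bal = fcm_undersample(X_demo, y_demo, majority_label=0, minority_label=1)
print(np.bincount(y_bal))  # class counts after under-sampling
```

The balanced set can then be passed to any classifier. Because several centers may share the same nearest sample, the number of majority samples kept can be smaller than the minority class size.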