A No Parameter Synthetic Minority Oversampling Technique Based on Finch for Imbalanced Data

Cited by: 1
Authors
Xu, Shoukun [1]
Li, Zhibang [1]
Yuan, Baohua [1]
Yang, Gaochao [1]
Wang, Xueyuan [1]
Li, Ning [1]
Affiliations
[1] Changzhou Univ, Coll Comp & Artificial Intelligence, Changzhou 213164, Jiangsu, Peoples R China
Keywords
SMOTE; FINCH algorithm; Synthesis strategy; Sampling method
DOI
10.1007/978-981-99-4752-2_31
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The synthetic minority oversampling technique (SMOTE) has emerged as a significant approach to addressing class imbalance in machine learning. However, the algorithm suffers from problems such as the uneven distribution of minority class data and the uncertain quality of the synthetic data. Enhanced variants that combine SMOTE with a clustering algorithm encounter further problems, such as class overlap and the difficulty of determining optimal hyperparameter values. This paper therefore proposes an improved algorithm named NP-SMOTE. The core idea is as follows. First, the FINCH algorithm is used to partition the minority class data into distinct clusters. Next, the data within each cluster are divided into boundary data and central data according to the class of each minority sample's nearest neighbors. Finally, an appropriate synthesis method is applied to generate data for each of these two categories of minority class data. The algorithm requires no predetermined hyperparameters and avoids the limitations of class overlap by synthesizing data for the two categories in a customized manner. Comparison with commonly used algorithms across 6 datasets demonstrates the robustness and superior generalizability of the algorithm.
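The three steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and it rests on several assumptions. Clustering uses only the first FINCH-style partition (each point is linked to its first nearest neighbor and the connected components become clusters), without the recursive merging of the full FINCH algorithm. The boundary rule (a minority point is "boundary" when its nearest neighbor overall is a majority point) and the synthesis rule (SMOTE-style interpolation toward a same-cluster point, with a shorter step for boundary seeds) are hypothetical placeholders, since the abstract does not specify them. All function names and the toy data are invented for illustration.

```python
import math
import random


def first_neighbor_clusters(points):
    """First FINCH-style partition: link every point to its nearest
    neighbor and return the connected components as cluster labels."""
    parent = list(range(len(points)))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, p in enumerate(points):
        j = min((k for k in range(len(points)) if k != i),
                key=lambda k: math.dist(p, points[k]))
        parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def boundary_flags(minority, majority):
    """Assumed rule: a minority point is 'boundary' when its nearest
    neighbor in the whole data set belongs to the majority class."""
    flags = []
    for i, p in enumerate(minority):
        d_min = min(math.dist(p, q) for k, q in enumerate(minority) if k != i)
        d_maj = min(math.dist(p, q) for q in majority)
        flags.append(d_maj < d_min)
    return flags


def synthesize(minority, labels, flags, n_new, rng):
    """Placeholder synthesis: interpolate from a seed toward a random
    same-cluster mate; boundary seeds take a shorter step (hypothetical)."""
    out = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        mates = [k for k, lab in enumerate(labels)
                 if lab == labels[i] and k != i]
        j = rng.choice(mates)
        gap = rng.random() * (0.5 if flags[i] else 1.0)
        out.append(tuple(a + gap * (b - a)
                         for a, b in zip(minority[i], minority[j])))
    return out


# Toy data (invented): two minority groups and a few majority points.
minority = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
majority = [(0, 1.2), (5, 5), (5, 6), (6, 5)]
labels = first_neighbor_clusters(minority)
flags = boundary_flags(minority, majority)
synthetic = synthesize(minority, labels, flags, 8, random.Random(0))
```

No cluster count or neighborhood size is passed to any function in this sketch, which matches the "no parameter" idea in the title. Because every point is linked to its nearest neighbor, each cluster contains at least two points, so a same-cluster mate always exists for interpolation.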
Pages: 367-378
Number of pages: 12