A No Parameter Synthetic Minority Oversampling Technique Based on Finch for Imbalanced Data

Cited by: 1
Authors
Xu, Shoukun [1]
Li, Zhibang [1]
Yuan, Baohua [1]
Yang, Gaochao [1]
Wang, Xueyuan [1]
Li, Ning [1]
Affiliations
[1] Changzhou Univ, Coll Comp & Artificial Intelligence, Changzhou 213164, Jiangsu, Peoples R China
Keywords
SMOTE; FINCH algorithm; Synthesis strategy; Sampling method
DOI
10.1007/978-981-99-4752-2_31
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The synthetic minority oversampling technique (SMOTE) has become an important approach to class imbalance in machine learning. However, the algorithm suffers from the uneven distribution of minority class data and from concerns about the quality of the synthetic data. Enhanced variants that combine SMOTE with a clustering algorithm run into further problems, such as the difficulty of choosing optimal hyperparameter values and class overlap. This paper therefore proposes an improved algorithm named NP-SMOTE. The core idea is as follows. First, the FINCH algorithm clusters the minority class data into distinct clusters. Next, the data within each cluster are divided into boundary data and central data according to the class of the nearest neighbors of each minority class sample. Finally, an appropriate synthesis method is applied to generate data for each of these two categories of minority class data. The algorithm requires no predetermined hyperparameters and avoids the limitations of class overlap by synthesizing data for the two categories with tailored methods. Comparison with commonly used algorithms on 6 datasets shows that the algorithm is robust and generalizes better.
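The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' NP-SMOTE implementation, and it rests on several assumptions: only the first FINCH partition is computed (first-neighbor linking, without the recursive merging of the full algorithm); a minority sample counts as "boundary" when its single nearest neighbor is a majority sample, since the abstract does not state the exact rule; and plain SMOTE-style interpolation within a cluster stands in for the paper's two synthesis methods, which the abstract does not describe. All function names are hypothetical.

```python
import numpy as np

def first_neighbor_clusters(X):
    """First FINCH-style partition: link each point to its nearest
    neighbor and label the connected components (parameter-free)."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    nearest = d.argmin(axis=1)
    parent = list(range(n))              # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        parent[find(i)] = find(int(j))   # merge i with its first neighbor
    roots = np.array([find(i) for i in range(n)])
    _, labels = np.unique(roots, return_inverse=True)
    return labels

def boundary_mask(X_min, X_maj):
    """Assumed rule: True where the nearest neighbor of a minority
    sample (over both classes) belongs to the majority class."""
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    d_maj = np.linalg.norm(X_min[:, None, :] - X_maj[None, :, :], axis=2)
    return d_maj.min(axis=1) < d_min.min(axis=1)

def interpolate_within_clusters(X_min, labels, n_new, rng):
    """Stand-in synthesis: SMOTE-style interpolation between a sample
    and a random member of the same cluster."""
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        mates = np.flatnonzero(labels == labels[i])
        mates = mates[mates != i]        # first-neighbor clusters have >= 2 members
        j = rng.choice(mates)
        new_points.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.array(new_points)

# Toy data: two minority groups and two majority points (made up for illustration).
X_min = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
X_maj = np.array([[0.5, 1.2], [5.0, 5.0]])
labels = first_neighbor_clusters(X_min)
is_boundary = boundary_mask(X_min, X_maj)
synthetic = interpolate_within_clusters(X_min, labels, 4, np.random.default_rng(0))
```

Because every new point is interpolated between two members of the same cluster, no synthetic sample falls in the empty region between clusters, which is the motivation for clustering before synthesis. A fuller version would treat the samples flagged by `is_boundary` differently from the central ones.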
Pages: 367-378
Page count: 12
Related papers
50 in total
  • [1] An improved and random synthetic minority oversampling technique for imbalanced data
    Wei, Guoliang
    Mu, Weimeng
    Song, Yan
    Dou, Jun
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [2] A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning
    Barua, Sukarna
    Islam, Md. Monirul
    Murase, Kazuyuki
    NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 735 - +
  • [3] Performance of Synthetic Minority Oversampling Technique on Imbalanced Breast Cancer Data
    Rani, K. Usha
    Ramadevi, G. Naga
    Lavanya, D.
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 1623 - 1627
  • [4] Clustering-based improved adaptive synthetic minority oversampling technique for imbalanced data classification
    Jin, Dian
    Xie, Dehong
    Liu, Di
    Gong, Murong
    INTELLIGENT DATA ANALYSIS, 2023, 27 (03) : 635 - 652
  • [5] A Synthetic Minority Oversampling Technique Based on Gaussian Mixture Model Filtering for Imbalanced Data Classification
    Xu, Zhaozhao
    Shen, Derong
    Kou, Yue
    Nie, Tiezheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3740 - 3753
  • [6] An extension of Synthetic Minority Oversampling Technique based on Kalman filter for imbalanced datasets
    Thejas, G. S.
    Hariprasad, Yashas
    Iyengar, S. S.
    Sunitha, N. R.
    Badrinath, Prajwal
    Chennupati, Shasank
    MACHINE LEARNING WITH APPLICATIONS, 2022, 8
  • [7] A Classification Model for Imbalanced Medical Data based on PCA and Farther Distance based Synthetic Minority Oversampling Technique
    Mustafa, Nadir
    Memon, Raheel A.
    Li, Jian-Ping
    Omer, Mohammed Z.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2017, 8 (01) : 61 - 67
  • [8] A novel synthetic minority oversampling technique based on relative and absolute densities for imbalanced classification
    Liu, Ruijuan
    APPLIED INTELLIGENCE, 2023, 53 (01) : 786 - 803
  • [9] LSMOTE: A link-based Synthetic Minority Oversampling Technique for binary imbalanced datasets
    Cai, Qin-Nan
    Zhang, Zhong-Liang
    Wu, Yu-Heng
    Zhang, Xiu-Ming
    NEUROCOMPUTING, 2024, 608