BCGAN: A CGAN-based over-sampling model using the boundary class for data balancing

Cited by: 10
Authors
Son, Minjae [1]
Jung, Seungwon [2]
Jung, Seungmin [2]
Hwang, Eenjun [2]
Affiliations
[1] Kyowon, Seoul 04539, South Korea
[2] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
Source
JOURNAL OF SUPERCOMPUTING | 2021 / Vol. 77 / No. 09
Funding
National Research Foundation, Singapore;
Keywords
Imbalanced data; Conditional generative adversarial network (CGAN); Borderline minority class; Over-sampling; MACHINE; SVM; CLASSIFICATION; PREDICTION; ACCURACY; SMOTE;
DOI
10.1007/s11227-021-03688-6
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A class imbalance problem occurs when a dataset is decomposed into one majority class and one minority class. This problem is critical in machine learning because it biases the models trained on such data. One popular remedy is a sampling technique that balances the class distribution by either under-sampling the majority class or over-sampling the minority class. Existing over-sampling techniques, however, have suffered from overfitting and noisy data generation. In this paper, we propose an over-sampling scheme based on the borderline class and a conditional generative adversarial network (CGAN). More specifically, we define a borderline class from the minority class data near the majority class. Then, we generate data for the borderline class using the CGAN for data balancing. To demonstrate the performance of the proposed scheme, we conducted various experiments on diverse imbalanced datasets. We report some of the results.
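The abstract does not say how "minority class data near the majority class" is identified. As a rough illustration only, the sketch below assumes a k-nearest-neighbor rule borrowed from the "danger" set of Borderline-SMOTE: a minority point is relabeled as borderline when at least half, but not all, of its k nearest neighbors belong to the majority class. The function name `label_borderline`, the constant `BORDERLINE`, and the parameter `k` are hypothetical and not taken from the paper, and the CGAN generation step is not shown.

```python
import numpy as np

BORDERLINE = 2  # hypothetical label for the new borderline class


def label_borderline(X, y, minority=1, k=5):
    """Return a copy of y in which minority points lying close to the
    majority class are relabeled as BORDERLINE.

    Assumption (not from the paper): a minority point is "borderline"
    when k/2 <= (number of non-minority neighbors) < k among its k
    nearest neighbors, as in the Borderline-SMOTE "danger" rule.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    new_y = y.copy()

    # Pairwise squared Euclidean distances (fine for small datasets).
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    for i in np.where(y == minority)[0]:
        neighbors = np.argsort(dist[i])[:k]
        n_other = int(np.sum(y[neighbors] != minority))
        if k / 2 <= n_other < k:
            new_y[i] = BORDERLINE
    return new_y
```

Under these assumptions, the relabeled vector could serve as the conditioning label for a CGAN, so that synthetic samples are requested specifically for the borderline class rather than for the whole minority class.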
Pages: 10463-10487
Page count: 25