BCGAN: A CGAN-based over-sampling model using the boundary class for data balancing

Cited by: 10
Authors
Son, Minjae [1]
Jung, Seungwon [2]
Jung, Seungmin [2]
Hwang, Eenjun [2]
Affiliations
[1] Kyowon, Seoul 04539, South Korea
[2] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
Source
JOURNAL OF SUPERCOMPUTING | 2021 / Vol. 77 / No. 09
Funding
National Research Foundation, Singapore;
Keywords
Imbalanced data; Conditional generative adversarial network (CGAN); Borderline minority class; Over-sampling; MACHINE; SVM; CLASSIFICATION; PREDICTION; ACCURACY; SMOTE;
DOI
10.1007/s11227-021-03688-6
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812 ;
Abstract
A class imbalance problem occurs when one class in a dataset (the majority class) has far more samples than the other (the minority class). This problem is critical in the machine learning domain because it induces bias in trained models. One popular remedy is a sampling technique that balances the class distribution by either under-sampling the majority class or over-sampling the minority class. Existing over-sampling techniques, however, have suffered from overfitting and noisy data generation. In this paper, we propose an over-sampling scheme based on the borderline class and the conditional generative adversarial network (CGAN). More specifically, we define a borderline class consisting of the minority class data near the majority class. Then, we generate data for the borderline class using the CGAN for data balancing. To demonstrate the performance of the proposed scheme, we conducted various experiments on diverse imbalanced datasets. We report some of the results.
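The borderline-class step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how "near the majority class" is measured, so the k-nearest-neighbour rule below (a minority point is borderline when at least half of its k nearest neighbours belong to the majority class) is an assumption in the spirit of Borderline-SMOTE-style heuristics, not the authors' definition. The function name `label_borderline`, the parameters `k` and `threshold`, and the label value `2` are all hypothetical. The CGAN training itself is omitted; the returned labels would serve as the class condition for such a generator.

```python
import numpy as np

def label_borderline(X, y, minority=1, k=5, threshold=0.5, borderline_label=2):
    """Relabel minority samples that lie close to the majority class.

    Assumed rule (not taken from the paper): a minority sample is
    'borderline' if at least `threshold` of its k nearest neighbours
    are non-minority samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances between all samples.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Fraction of each sample's neighbours that are not in the minority class.
    majority_fraction = (y[neighbours] != minority).mean(axis=1)

    new_y = y.copy()
    new_y[(y == minority) & (majority_fraction >= threshold)] = borderline_label
    return new_y

# Toy data: a majority cluster near the origin, a minority cluster far away,
# and one minority point sitting right next to the majority cluster.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [10, 10], [10, 11], [11, 10], [11, 11],
     [1.5, 1.5]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
labels = label_borderline(X, y, k=3)
```

In this toy example only the last point, which sits beside the majority cluster, is moved into the borderline class; a conditional generator could then be asked to synthesise samples for that class specifically.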
Pages: 10463 - 10487
Page count: 25
Related papers
50 in total
  • [31] Handling Autism Imbalanced Data using Synthetic Minority Over-Sampling Technique (SMOTE)
    El-Sayed, Asmaa Ahmed
    Meguid, Nagwa Abdel
    Mahmood, Mahmood Abdel Manem
    Hefny, Hesham Ahmed
    PROCEEDINGS OF 2015 THIRD IEEE WORLD CONFERENCE ON COMPLEX SYSTEMS (WCCS), 2015,
  • [32] Probability Density Function Estimation Based Over-Sampling for Imbalanced Two-Class Problems
    Gao, Ming
    Hong, Xia
    Chen, Sheng
    Harris, Chris J.
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
  • [33] Understanding the apparent superiority of over-sampling through an analysis of local information for class-imbalanced data
    Garcia, V
    Sanchez, J. S.
    Marques, A. I.
    Florencia, R.
    Rivera, G.
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 158
  • [34] Over-Sampling Emotional Speech Data Based on Subjective Evaluations Provided by Multiple Individuals
    Lotfian, Reza
    Busso, Carlos
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2021, 12 (04) : 870 - 882
  • [35] An adaptive over-sampling method for imbalanced data based on simultaneous clustering and filtering noisy
    Chen, Wei
    Guo, Wenjie
    Mao, Weijie
    APPLIED INTELLIGENCE, 2024, 54 (22) : 11430 - 11449
  • [36] Data-Driven Cervical Cancer Prediction Model with Outlier Detection and Over-Sampling Methods
    Ijaz, Muhammad Fazal
    Attique, Muhammad
    Son, Youngdoo
    SENSORS, 2020, 20 (10)
  • [37] Model validation of plant in closed-loop based on output over-sampling scheme
    Sun, Lianming
    Sano, Akira
    ICIC Express Letters, 2015, 9 (12) : 3179 - 3185
  • [38] Software defect prediction using over-sampling and feature extraction based on Mahalanobis distance
    Mohammad Mahdi NezhadShokouhi
    Mohammad Ali Majidi
    Abbas Rasoolzadegan
    The Journal of Supercomputing, 2020, 76 : 602 - 635
  • [39] Improving Diagnostic Performance of a Power Transformer Using an Adaptive Over-Sampling Method for Imbalanced Data
    Tra, Viet
    Bach-Phi Duong
    Kim, Jong-Myon
    IEEE TRANSACTIONS ON DIELECTRICS AND ELECTRICAL INSULATION, 2019, 26 (04) : 1325 - 1333
  • [40] Software defect prediction using over-sampling and feature extraction based on Mahalanobis distance
    NezhadShokouhi, Mohammad Mahdi
    Majidi, Mohammad Ali
    Rasoolzadegan, Abbas
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (01) : 602 - 635