BCGAN: A CGAN-based over-sampling model using the boundary class for data balancing

Cited by: 10
Authors
Son, Minjae [1]
Jung, Seungwon [2]
Jung, Seungmin [2]
Hwang, Eenjun [2]
Affiliations
[1] Kyowon, Seoul 04539, South Korea
[2] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
Source
JOURNAL OF SUPERCOMPUTING | 2021 / Vol. 77 / Issue 09
Funding
National Research Foundation of Singapore;
Keywords
Imbalanced data; Conditional generative adversarial network (CGAN); Borderline minority class; Over-sampling; MACHINE; SVM; CLASSIFICATION; PREDICTION; ACCURACY; SMOTE;
DOI
10.1007/s11227-021-03688-6
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A class imbalance problem occurs when a dataset is decomposed into one majority class and one minority class. This problem is critical in the machine learning domains because it induces bias in training machine learning models. One popular method to solve this problem is using a sampling technique to balance the class distribution by either under-sampling the majority class or over-sampling the minority class. So far, diverse over-sampling techniques have suffered from overfitting and noisy data generation problems. In this paper, we propose an over-sampling scheme based on the borderline class and conditional generative adversarial network (CGAN). More specifically, we define a borderline class based on the minority class data near the majority class. Then, we generate data for the borderline class using the CGAN for data balancing. To demonstrate the performance of the proposed scheme, we conducted various experiments on diverse imbalanced datasets. We report some of the results.
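The abstract does not spell out how "minority class data near the majority class" is identified, so the following is only a minimal sketch of one plausible reading: a minority point is relabeled as borderline when at least half of its k nearest neighbors belong to the majority class. The function name `label_borderline`, the `BORDERLINE` label value, and the `k` and `threshold` parameters are all assumptions made for illustration and are not taken from the paper.

```python
from math import dist

BORDERLINE = 2  # hypothetical label for the new borderline class


def label_borderline(X, y, minority=1, k=5, threshold=0.5):
    """Return a copy of y in which minority points lying close to the
    majority class are relabeled as BORDERLINE.

    Assumed rule (not from the paper): a minority point is borderline when
    the share of majority points among its k nearest neighbors is at least
    `threshold`.
    """
    labels = list(y)
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority:
            continue
        # Indices of the k nearest other points by Euclidean distance.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )[:k]
        if not others:
            continue
        majority_share = sum(y[j] != minority for j in others) / len(others)
        if majority_share >= threshold:
            labels[i] = BORDERLINE
    return labels
```

Under this reading, the relabeled three-class data would then serve as the conditioning labels for a CGAN, with synthetic samples requested for the borderline label; that generation step is not sketched here.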
Pages: 10463 - 10487
Page count: 25