An improved generative adversarial network to oversample imbalanced datasets

Cited by: 2
Authors
Pan, Tingting [1]
Pedrycz, Witold [2]
Yang, Jie [3, 4]
Wang, Jian [5]
Affiliations
[1] Dalian Polytech Univ, Dept Basic Courses Teaching, Dalian 116034, Peoples R China
[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2G7, Canada
[3] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
[4] Key Lab Computat Math & Data Intelligence Liaoning, Dalian 116024, Peoples R China
[5] China Univ Petr East China, Coll Sci, Qingdao 266580, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Imbalanced learning; Generative adversarial network (GAN); Oversampling; Probability distribution; CLASSIFICATION; CLASSIFIERS; SMOTE; GAN;
DOI
10.1016/j.engappai.2024.107934
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many oversampling methods applied to imbalanced data generate samples according to the local density distribution of minority samples. However, samples generated by these methods can only present a non-deterministic relationship between the local and global distributions. A generative adversarial network (GAN) is a suitable tool for learning an unknown global probability distribution. In this paper, we propose an improved GAN (I-GAN) to oversample according to the global underlying structure of minority samples. The originality of I-GAN stems from the fact that it provides additional density distribution information about minority samples to the GAN and to the generated samples. Building on this idea, three detailed strategies are presented: the input random vectors of the generator are sampled from a rough estimate of the distribution of minority samples to make the fake samples more believable; a residual term on the minority samples is added to the discriminator to strengthen the constraint of the loss function; and the generated samples are redistributed with a reshaper. These three strategies provide innovative methodologies at various stages of GANs for the oversampling task. In comparisons with 22 classical and popular imbalanced sampling methods under the Gm, F1, and AUC metrics on 24 benchmark imbalanced datasets, I-GAN is shown to be effective and robust. The I-GAN implementation has been uploaded to GitHub (https://github.com/flowerbloom000/I-GAN).
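The first strategy in the abstract, drawing the generator's input vectors from a rough estimate of the minority distribution rather than from a fixed standard noise distribution, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which estimator is used, so a single Gaussian fitted to the minority samples is assumed here, and the function names `fit_rough_gaussian` and `sample_generator_inputs` as well as the `ridge` parameter are hypothetical.

```python
import numpy as np


def fit_rough_gaussian(minority, ridge=1e-6):
    """Fit one Gaussian to the minority samples (an assumed 'rough estimate').

    `ridge` is a hypothetical small diagonal term that keeps the covariance
    well conditioned when there are few minority samples.
    """
    minority = np.asarray(minority, dtype=float)
    mean = minority.mean(axis=0)
    # rowvar=False: rows are samples, columns are features.
    cov = np.atleast_2d(np.cov(minority, rowvar=False))
    cov = cov + ridge * np.eye(minority.shape[1])
    return mean, cov


def sample_generator_inputs(mean, cov, n, seed=0):
    """Draw n generator input vectors from the fitted Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n)


if __name__ == "__main__":
    # Synthetic minority class, for illustration only.
    rng = np.random.default_rng(42)
    minority = rng.normal(loc=[5.0, -3.0], scale=0.5, size=(200, 2))
    mean, cov = fit_rough_gaussian(minority)
    z = sample_generator_inputs(mean, cov, n=1000)
    print(z.shape)
```

In this sketch the generator input has the same dimension as the data, which is a simplification; the repository linked in the abstract is the place to check how the input vectors are actually constructed.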
Pages: 14