An improved generative adversarial network to oversample imbalanced datasets

Cited by: 2
Authors
Pan, Tingting [1]
Pedrycz, Witold [2]
Yang, Jie [3, 4]
Wang, Jian [5]
Affiliations
[1] Dalian Polytech Univ, Dept Basic Courses Teaching, Dalian 116034, Peoples R China
[2] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2G7, Canada
[3] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
[4] Key Lab Computat Math & Data Intelligence Liaoning, Dalian 116024, Peoples R China
[5] China Univ Petr East China, Coll Sci, Qingdao 266580, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Imbalanced learning; Generative adversarial network (GAN); Oversampling; Probability distribution; CLASSIFICATION; CLASSIFIERS; SMOTE; GAN;
DOI
10.1016/j.engappai.2024.107934
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Many oversampling methods applied to imbalanced data generate samples according to the local density distribution of minority samples. However, samples generated by these methods can only present a non-deterministic relationship between the local and global distributions. A generative adversarial network (GAN) is a suitable tool for learning an unknown global probability distribution. In this paper, we propose an improved GAN (I-GAN) that oversamples according to the global underlying structure of minority samples. The originality of I-GAN stems from the fact that it provides additional density distribution information about minority samples to both the GAN and the generated samples. Building on this idea, three detailed strategies are presented: the input random vectors of the generator are sampled from a rough estimate of the distribution of minority samples to make fake samples more believable; a residual term based on minority samples is added to the discriminator to strengthen the constraint of the loss function; and generated samples are redistributed with a reshaper. These three strategies provide innovative methodologies at various stages of GANs for the oversampling task. Comparisons with 22 classical and popular imbalanced sampling methods under the metrics of Gm, F1, and AUC on 24 benchmark imbalanced datasets show that I-GAN is effective and robust. The I-GAN implementation has been uploaded to GitHub (https://github.com/flowerbloom000/I-GAN).
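As a rough illustration of the first strategy only, the sketch below draws generator input vectors from a simple estimate of the minority-class distribution instead of from standard noise. It is not taken from the authors' repository: the function name `sample_generator_inputs` is hypothetical, and the choice of a single multivariate Gaussian as the "rough estimate" is an assumption; the paper may use a different estimator.

```python
import numpy as np

def sample_generator_inputs(minority, n_samples, seed=0):
    """Draw generator input vectors from a Gaussian fitted to the minority class.

    Assumption: the "rough estimate" is one multivariate Gaussian;
    the paper's actual estimator may differ.
    """
    rng = np.random.default_rng(seed)
    mean = minority.mean(axis=0)
    # A small ridge keeps the covariance positive definite for tiny classes.
    cov = np.cov(minority, rowvar=False) + 1e-6 * np.eye(minority.shape[1])
    return rng.multivariate_normal(mean, cov, size=n_samples)

# Toy minority class: 30 points in 2-D centred near (3, -1).
toy_rng = np.random.default_rng(42)
minority = toy_rng.normal(loc=[3.0, -1.0], scale=0.5, size=(30, 2))

# 200 input vectors that already resemble the minority class.
z = sample_generator_inputs(minority, n_samples=200)
```

In a full pipeline these vectors would be fed to the generator in place of standard Gaussian or uniform noise; the residual term in the discriminator and the reshaper are not covered by this sketch.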
Pages: 14
Related papers
50 in total
  • [21] A novel generative adversarial network for improving crash severity modeling with imbalanced data
    Chen, Junlan
    Pu, Ziyuan
    Zheng, Nan
    Wen, Xiao
    Ding, Hongliang
    Guo, Xiucheng
    [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2024, 164
  • [22] Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation
    Mina Rezaei
    Haojin Yang
    Christoph Meinel
    [J]. Multimedia Tools and Applications, 2020, 79 : 15329 - 15348
  • [23] A dynamic spectrum loss generative adversarial network for intelligent fault with imbalanced data
    Wang, Xin
    Jiang, Hongkai
    Liu, Yunpeng
    Liu, Shaowei
    Yang, Qiao
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [24] Towards Imbalanced Image Classification: A Generative Adversarial Network Ensemble Learning Method
    Huang, Yangru
    Jin, Yi
    Li, Yidong
    Lin, Zhiping
    [J]. IEEE ACCESS, 2020, 8 : 88399 - 88409
  • [25] Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation
    Rezaei, Mina
    Yang, Haojin
    Meinel, Christoph
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (21-22) : 15329 - 15348
  • [26] Fault Diagnosis of Harmonic Drive With Imbalanced Data Using Generative Adversarial Network
    Yang, Guo
    Zhong, Yong
    Yang, Lie
    Tao, Hui
    Li, Jianying
    Du, Ruxu
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [27] Improved Generative Adversarial Network for Image Scene Transformation
    [J]. Xiao, Jin-Sheng (xiaojs@whu.edu.cn), 1600, Chinese Academy of Sciences (32): 2755 - 2768
  • [28] Jujube quality grading using a generative adversarial network with an imbalanced data set
    Cang, Hao
    Yan, Tianying
    Duan, Long
    Yan, Jingkun
    Zhang, Yuan
    Tan, Fei
    Lv, Xin
    Gao, Pan
    [J]. BIOSYSTEMS ENGINEERING, 2023, 236 : 224 - 237
  • [29] Enhanced detection of imbalanced malicious network traffic with regularized Generative Adversarial Networks
    Chapaneri, Radhika
    Shah, Seema
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2022, 202
  • [30] Detracking Autoencoding Conditional Generative Adversarial Network: Improved Generative Adversarial Network Method for Tabular Missing Value Imputation
    Liu, Jingrui
    Duan, Zixin
    Hu, Xinkai
    Zhong, Jingxuan
    Yin, Yunfei
    [J]. ENTROPY, 2024, 26 (05)