PolicyGAN: Training generative adversarial networks using policy gradient

Cited by: 0
Authors
Paria, Biswajit [1]
Lahiri, Avisek [2]
Biswas, Prabir Kumar [2]
Affiliations
[1] IIT, Dept CSE, Kharagpur, W Bengal, India
[2] IIT, Dept E&ECE, Kharagpur, W Bengal, India
Keywords
Generative Adversarial Networks; Reinforcement Learning; Policy Gradient; Inception Score; Adversarial Learning;
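DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract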
This paper presents PolicyGAN, a policy gradient paradigm for training Generative Adversarial Networks (GANs) that views the generator as an image-generating neural agent rewarded by another neural agent, termed the discriminator. Rewards are higher for samples near the original data manifold. In PolicyGAN, only the reward signal from the output of the discriminator is used to update the generator network via policy gradient. This removes the need for a gradient signal to flow through the discriminator when training the generator, which is an intrinsic requirement of the original GAN formulation. Given the inherent difficulty of training adversarial models and the low convergence speed of policy gradient, training GANs using policy gradient is a non-trivial problem that warrants careful study. To date, GANs have been trained only with differentiable discriminators. PolicyGAN opens up the possibility of using a wide variety of non-differentiable discriminator networks for training GANs, something that was not possible with the original GAN framework. Another advantage of using policy gradient is that the generator need not produce deterministic samples; it can instead output a probability distribution from which samples are drawn. PolicyGAN thus paves the way for using a variety of probabilistic models.
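The idea described in the abstract, updating a stochastic generator from a scalar reward alone, can be illustrated with a toy REINFORCE loop. This is a minimal sketch and not the authors' implementation: the one-dimensional Gaussian "generator", the hypothetical `black_box_reward` function standing in for a discriminator score, the batch-mean baseline, and all hyperparameters are assumptions chosen for illustration. The reward is piecewise constant, so its gradient is zero almost everywhere, yet the generator mean should still drift toward the high-reward region.

```python
import math
import random


def black_box_reward(x, target=3.0):
    """Hypothetical stand-in for a discriminator score (an assumption).

    Piecewise constant in x, so its gradient is zero almost everywhere and
    it could not train a generator through backpropagation.
    """
    return -math.floor(abs(x - target) * 4) / 4


def train_generator(steps=500, batch=64, lr=0.05, sigma=1.0, seed=0):
    """REINFORCE on a 1-D Gaussian 'generator' with a learnable mean."""
    rng = random.Random(seed)
    mu = 0.0  # the generator's only parameter
    for _ in range(steps):
        # The generator defines a distribution; samples are drawn from it.
        samples = [rng.gauss(mu, sigma) for _ in range(batch)]
        rewards = [black_box_reward(x) for x in samples]
        baseline = sum(rewards) / batch  # batch-mean baseline to cut variance
        # Gradient of log N(x; mu, sigma) w.r.t. mu is (x - mu) / sigma**2.
        grad = sum(
            (r - baseline) * (x - mu) / sigma**2
            for x, r in zip(samples, rewards)
        ) / batch
        mu += lr * grad  # gradient ascent on the expected reward
    return mu


if __name__ == "__main__":
    print(train_generator())
```

Only reward values are read from `black_box_reward`; no derivative of the reward with respect to the sample is ever computed, which is the property the abstract attributes to policy-gradient training of the generator.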
Pages: 151-156
Page count: 6