PolicyGAN: Training generative adversarial networks using policy gradient

Cited by: 0

Authors:

Paria, Biswajit ^{[1]}

Lahiri, Avisek ^{[2]}

Biswas, Prabir Kumar ^{[2]}

Affiliations:

[1] IIT, Dept CSE, Kharagpur, W Bengal, India

[2] IIT, Dept E&ECE, Kharagpur, W Bengal, India

Source:

2017 NINTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION (ICAPR) | 2017

Keywords:

Generative Adversarial Networks; Reinforcement Learning; Policy Gradient; Inception Score; Adversarial Learning;

DOI:

Not available

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents PolicyGAN, a policy gradient paradigm for training Generative Adversarial Networks that views the generator as an image-generating neural agent rewarded by another neural agent, termed the discriminator. Rewards are higher for samples near the original data manifold. In PolicyGAN, only the reward signal from the discriminator's output is used to update the generator network via policy gradient. This obviates the need for a gradient signal to flow through the discriminator when training the generator, a requirement intrinsic to the original GAN formulation. Given the inherent difficulty of training adversarial models and the low convergence speed of policy gradient, training GANs using policy gradient is a non-trivial problem and requires deep study. So far, GANs have used only differentiable discriminators for training. PolicyGAN opens up the possibility of using a wide variety of non-differentiable discriminator networks for training GANs, something that was not possible with the original GAN framework. Another advantage of policy gradient is that the generator need not produce deterministic samples but can instead output a probability distribution from which samples are drawn. PolicyGAN thus paves the way for using a variety of probabilistic models.
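
The core idea in the abstract, updating a stochastic generator from a scalar reward alone, can be illustrated with a toy score-function (REINFORCE) estimator. The sketch below is not the authors' implementation: it assumes a one-dimensional Gaussian "generator" with a single learnable mean `mu`, and a fixed, hypothetical `black_box_reward` function standing in for a non-differentiable discriminator. The names `black_box_reward` and `train_generator`, the target interval, and all hyperparameters are illustrative assumptions, and the discriminator is never trained here.

```python
import numpy as np


def black_box_reward(x):
    """Hypothetical stand-in for a discriminator score (an assumption).

    Returns 1.0 for samples inside the interval (1, 3) and 0.0 elsewhere.
    The function is piecewise constant, so its gradient with respect to x
    carries no useful training signal.
    """
    return (np.abs(x - 2.0) < 1.0).astype(float)


def train_generator(steps=300, batch=512, lr=0.5, sigma=1.0, seed=0):
    """Fit the mean of a Gaussian generator using only reward values."""
    rng = np.random.default_rng(seed)
    mu = 0.0  # learnable generator parameter
    for _ in range(steps):
        # The generator outputs a distribution N(mu, sigma^2); sample from it.
        x = rng.normal(mu, sigma, size=batch)
        r = black_box_reward(x)
        # Subtract the batch-mean reward as a baseline to reduce variance.
        advantage = r - r.mean()
        # Score function: d/d(mu) log N(x | mu, sigma^2) = (x - mu) / sigma^2.
        score = (x - mu) / sigma**2
        # REINFORCE update: gradient ascent on the expected reward.
        mu += lr * np.mean(advantage * score)
    return float(mu)


if __name__ == "__main__":
    print("learned mean:", train_generator())
```

Because the update uses only the reward values `r` and the log-density of the generator's own samples, nothing is ever differentiated through `black_box_reward`. In this toy setting the mean should drift toward the centre of the rewarded interval.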

Pages: 151-156

Page count: 6

50 items in total

[21] Surgical Tool Segmentation Using Generative Adversarial Networks With Unpaired Training Data
Zhang, Zhongkai
Rosa, Benoit
Nageotte, Florent
[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (04) : 6266 - 6273
[22] Behavioral Repertoire via Generative Adversarial Policy Networks
Jegorova, Marija
Doncieux, Stephane
Hospedales, Timothy M.
[J]. 2019 JOINT IEEE 9TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING AND EPIGENETIC ROBOTICS (ICDL-EPIROB), 2019, : 320 - 326
[23] Adaptive Weighted Discriminator for Training Generative Adversarial Networks
Zadorozhnyy, Vasily
Cheng, Qiang
Ye, Qiang
[J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4779 - 4788
[24] A Survey on Generative Adversarial Networks: Variants, Applications, and Training
Jabbar, Abdul
Li, Xi
Omar, Bourahla
[J]. ACM COMPUTING SURVEYS, 2021, 54 (08)
[25] Cloning and training collective intelligence with generative adversarial networks
Terziyan, Vagan
Gavriushenko, Mariia
Girka, Anastasiia
Gontarenko, Andrii
Kaikova, Olena
[J]. IET COLLABORATIVE INTELLIGENT MANUFACTURING, 2021, 3 (01) : 64 - 74
[26] Posit Arithmetic for the Training and Deployment of Generative Adversarial Networks
Nhut-Minh Ho
Duy-Thanh Nguyen
De Silva, Himeshi
Gustafson, John L.
Wong, Weng-Fai
Chang, Ik Joon
[J]. PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 1350 - 1355
[27] Training and Validation of Automatic Target Recognition Systems using Generative Adversarial Networks
Karjalainen, Antti Ilari
Mitchell, Roshenac
Vazquez, Jose
[J]. 2019 SENSOR SIGNAL PROCESSING FOR DEFENCE CONFERENCE (SSPD), 2019,
[28] Stabilized Training of Generative Adversarial Networks by a Genetic Algorithm
Cho, Hwi-Yeon
Kim, Yong-Hyuk
[J]. PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'19 COMPANION), 2019, : 51 - 52
[29] Stabilizing Training of Generative Adversarial Networks through Regularization
Roth, Kevin
Lucchi, Aurelien
Nowozin, Sebastian
Hofmann, Thomas
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
[30] GenCo: Generative Co-training for Generative Adversarial Networks with Limited Data
Cui, Kaiwen
Huang, Jiaxing
Luo, Zhipeng
Zhang, Gongjie
Zhan, Fangneng
Lu, Shijian
[J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 499 - 507
