Improving the Performance of Batch-Constrained Reinforcement Learning in Continuous Action Domains via Generative Adversarial Networks

Cited by: 0

Authors:

Saglam, Baturay [1]

Dalmaz, Onat [1]

Gonc, Kaan [2]

Kozat, Suleyman S. [1]

Affiliations:

[1] Bilkent Univ, Elekt & Elekt Muhendisligi Bolumu, Ankara, Turkey

[2] Bilkent Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey

Source:

2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU | 2022

Keywords:

deep reinforcement learning; batch-constrained reinforcement learning; offline reinforcement learning;

DOI:

10.1109/SIU55565.2022.9864786

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

The Batch-Constrained Q-learning (BCQ) algorithm has been shown to overcome extrapolation error and enable deep reinforcement learning agents to learn from a previously collected, fixed batch of transitions. However, because its data generation module uses a conditional Variational Autoencoder (VAE), the BCQ algorithm optimizes a variational lower bound and hence does not generalize to environments with large state and action spaces. In this paper, we show that the performance of the BCQ algorithm can be further improved by employing one of the recent advances in deep learning, Generative Adversarial Networks. Our extensive set of experiments shows that the introduced approach significantly improves BCQ in all of the control tasks tested. Moreover, the introduced approach demonstrates robust generalizability to environments with large state and action spaces in the OpenAI Gym control suite.

Pages: 4
