Improving the Performance of Batch-Constrained Reinforcement Learning in Continuous Action Domains via Generative Adversarial Networks

Cited by: 0
Authors
Saglam, Baturay [1 ]
Dalmaz, Onat [1 ]
Gonc, Kaan [2 ]
Kozat, Suleyman S. [1 ]
Affiliations
[1] Bilkent Univ, Dept of Elect & Elect Engn, Ankara, Turkey
[2] Bilkent Univ, Dept of Comp Engn, Ankara, Turkey
Keywords
deep reinforcement learning; batch-constrained reinforcement learning; offline reinforcement learning
DOI
10.1109/SIU55565.2022.9864786
CLC number
TP39 [Computer Applications]
Discipline codes
081203; 0835
Abstract
The Batch-Constrained Q-learning (BCQ) algorithm has been shown to overcome the extrapolation error and to enable deep reinforcement learning agents to learn from a previously collected, fixed batch of transitions. However, because the conditional Variational Autoencoder (VAE) in its data-generation module optimizes only a variational lower bound, BCQ does not generalize well to environments with large state and action spaces. In this paper, we show that the performance of BCQ can be further improved by employing Generative Adversarial Networks (GANs), one of the recent advances in deep learning. Our extensive set of experiments shows that the introduced approach significantly improves BCQ on all of the control tasks tested. Moreover, the introduced approach generalizes robustly to environments with large state and action spaces in the OpenAI Gym control suite.
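The record contains no implementation details beyond the abstract, but the core idea it describes (modeling the batch's state-conditioned action distribution adversarially rather than with the conditional VAE) can be sketched. The snippet below is a minimal, illustrative PyTorch sketch under assumed design choices (network widths, noise dimension, standard BCE GAN loss); the names Generator, Discriminator, and adversarial_update are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: a conditional GAN that proposes "batch-like" actions
# given a state, as a stand-in for BCQ's conditional VAE action sampler.
# All architecture and loss choices below are assumptions, not the paper's.
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Maps (state, noise) to an action intended to resemble actions in the fixed batch."""

    def __init__(self, state_dim, action_dim, noise_dim=32, max_action=1.0):
        super().__init__()
        self.noise_dim = noise_dim
        self.max_action = max_action
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        noise = torch.randn(state.size(0), self.noise_dim, device=state.device)
        return self.max_action * self.net(torch.cat([state, noise], dim=1))


class Discriminator(nn.Module):
    """Scores (state, action) pairs: high for pairs drawn from the batch, low for generated ones."""

    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1))


def adversarial_update(gen, disc, gen_opt, disc_opt, state, action):
    """One GAN update on a mini-batch of (state, action) pairs sampled from the fixed buffer."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(state.size(0), 1, device=state.device)
    fake = torch.zeros(state.size(0), 1, device=state.device)

    # Train the discriminator to separate batch actions from generated actions.
    d_loss = bce(disc(state, action), real) + bce(disc(state, gen(state).detach()), fake)
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # Train the generator to fool the discriminator, keeping its samples close to the batch.
    g_loss = bce(disc(state, gen(state)), real)
    gen_opt.zero_grad()
    g_loss.backward()
    gen_opt.step()
    return d_loss.item(), g_loss.item()
```

In a BCQ-style agent, such a generator would be used wherever BCQ samples candidate actions from its VAE decoder (e.g., drawing several candidates per state, perturbing them, and selecting the one with the highest Q-value); the exact integration and hyperparameters here are assumptions for illustration, not details reported in the paper.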
Pages: 4