Improving the Performance of Batch-Constrained Reinforcement Learning in Continuous Action Domains via Generative Adversarial Networks

Cited by: 0
Authors
Saglam, Baturay [1 ]
Dalmaz, Onat [1 ]
Gonc, Kaan [2 ]
Kozat, Suleyman S. [1 ]
Affiliations
[1] Bilkent Univ, Dept of Elect & Elect Engn, Ankara, Turkey
[2] Bilkent Univ, Dept of Comp Engn, Ankara, Turkey
Keywords
deep reinforcement learning; batch-constrained reinforcement learning; offline reinforcement learning
DOI
10.1109/SIU55565.2022.9864786
CLC number
TP39 [Computer Applications]
Discipline codes
081203; 0835
Abstract
The Batch-Constrained Q-learning (BCQ) algorithm has been shown to overcome the extrapolation error and to enable deep reinforcement learning agents to learn from a previously collected, fixed batch of transitions. However, because the conditional Variational Autoencoder (VAE) in its data-generation module optimizes only a variational lower bound, BCQ does not generalize well to environments with large state and action spaces. In this paper, we show that the performance of BCQ can be further improved by employing Generative Adversarial Networks (GANs), one of the recent advances in deep learning. Our extensive set of experiments shows that the introduced approach significantly improves BCQ on all of the control tasks tested. Moreover, the introduced approach generalizes robustly to environments with large state and action spaces in the OpenAI Gym control suite.
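The record contains no implementation details beyond the abstract, but the core idea it describes (modeling the batch's state-conditioned action distribution adversarially rather than with the conditional VAE) can be sketched. The snippet below is a minimal, illustrative PyTorch sketch under assumed design choices (network widths, noise dimension, standard BCE GAN loss); the names Generator, Discriminator, and adversarial_update are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: a conditional GAN that proposes "batch-like" actions
# given a state, as a stand-in for BCQ's conditional VAE action sampler.
# All architecture and loss choices below are assumptions, not the paper's.
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Maps (state, noise) to an action intended to resemble actions in the fixed batch."""

    def __init__(self, state_dim, action_dim, noise_dim=32, max_action=1.0):
        super().__init__()
        self.noise_dim = noise_dim
        self.max_action = max_action
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        noise = torch.randn(state.size(0), self.noise_dim, device=state.device)
        return self.max_action * self.net(torch.cat([state, noise], dim=1))


class Discriminator(nn.Module):
    """Scores (state, action) pairs: high for pairs drawn from the batch, low for generated ones."""

    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1))


def adversarial_update(gen, disc, gen_opt, disc_opt, state, action):
    """One GAN update on a mini-batch of (state, action) pairs sampled from the fixed buffer."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(state.size(0), 1, device=state.device)
    fake = torch.zeros(state.size(0), 1, device=state.device)

    # Train the discriminator to separate batch actions from generated actions.
    d_loss = bce(disc(state, action), real) + bce(disc(state, gen(state).detach()), fake)
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # Train the generator to fool the discriminator, keeping its samples close to the batch.
    g_loss = bce(disc(state, gen(state)), real)
    gen_opt.zero_grad()
    g_loss.backward()
    gen_opt.step()
    return d_loss.item(), g_loss.item()
```

In a BCQ-style agent, such a generator would be used wherever BCQ samples candidate actions from its VAE decoder (e.g., drawing several candidates per state, perturbing them, and selecting the one with the highest Q-value); the exact integration and hyperparameters here are assumptions for illustration, not details reported in the paper.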
Pages: 4