Improving the Performance of Batch-Constrained Reinforcement Learning in Continuous Action Domains via Generative Adversarial Networks

Cited by: 0
|
Authors
Saglam, Baturay [1 ]
Dalmaz, Onat [1 ]
Gonc, Kaan [2 ]
Kozat, Suleyman S. [1 ]
Affiliations
[1] Bilkent Univ, Dept of Electrical & Electronics Engineering, Ankara, Turkey
[2] Bilkent Univ, Dept of Computer Engineering, Ankara, Turkey
Source
2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU | 2022
Keywords
deep reinforcement learning; batch-constrained reinforcement learning; offline reinforcement learning
D O I
10.1109/SIU55565.2022.9864786
CLC classification
TP39 [Computer Applications]
Subject classification codes
081203; 0835
Abstract
The Batch-Constrained Q-learning (BCQ) algorithm has been shown to overcome extrapolation error and enable deep reinforcement learning agents to learn from a previously collected, fixed batch of transitions. However, because its data generation module relies on a conditional Variational Autoencoder (VAE), BCQ only optimizes a variational lower bound and therefore does not generalize well to environments with large state and action spaces. In this paper, we show that the performance of BCQ can be further improved by employing one of the recent advances in deep learning, Generative Adversarial Networks (GANs). Our extensive set of experiments shows that the introduced approach significantly improves BCQ on all of the control tasks tested. Moreover, it demonstrates robust generalizability to environments with large state and action spaces in the OpenAI Gym control suite.
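The abstract describes swapping BCQ's conditional VAE data-generation module for a GAN that models the batch's state-conditioned action distribution. Below is a minimal, hypothetical PyTorch sketch of such a conditional GAN, not the authors' implementation: the Generator/Discriminator architectures, layer sizes, noise_dim, and the gan_update helper are illustrative assumptions only.

```python
# Hypothetical sketch: conditional GAN in place of BCQ's conditional VAE.
# Generator maps (state, noise) -> action; Discriminator scores (state, action)
# pairs as coming from the fixed batch ("real") or from the generator ("fake").
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, state_dim, action_dim, noise_dim=32, max_action=1.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )
        self.noise_dim = noise_dim
        self.max_action = max_action

    def forward(self, state):
        # Sample noise per state and produce a bounded continuous action.
        z = torch.randn(state.size(0), self.noise_dim, device=state.device)
        return self.max_action * self.net(torch.cat([state, z], dim=1))

class Discriminator(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),  # raw logit
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1))

def gan_update(G, D, g_opt, d_opt, state, action):
    """One adversarial step on a (state, action) mini-batch from the fixed buffer."""
    bce = nn.BCEWithLogitsLoss()
    # Discriminator step: batch actions are "real", generated actions are "fake".
    real_logits = D(state, action)
    fake_logits = D(state, G(state).detach())
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
             bce(fake_logits, torch.zeros_like(fake_logits))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator step: push sampled actions toward the batch action distribution.
    gen_logits = D(state, G(state))
    g_loss = bce(gen_logits, torch.ones_like(gen_logits))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

In a full batch-constrained pipeline, (state, action) tensors would be sampled from the fixed dataset, and the generator's candidate actions would then be perturbed and scored by the Q-networks as in BCQ; those steps are omitted here.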
Pages: 4
Related papers (50 in total)
  • [31] Kuo, Chia-Hsuan; Chen, Chiao-Ting; Lin, Sin-Jing; Huang, Szu-Hao. Improving Generalization in Reinforcement Learning-Based Trading by Using a Generative Adversarial Market Model. IEEE ACCESS, 2021, 9: 50738-50754.
  • [32] Shen, Shitian; Ausin, Markel Sanz; Mostafavi, Behrooz; Chi, Min. Improving Learning & Reducing Time: A Constrained Action-Based Reinforcement Learning Approach. PROCEEDINGS OF THE 26TH CONFERENCE ON USER MODELING, ADAPTATION AND PERSONALIZATION (UMAP'18), 2018: 43-51.
  • [33] Sarkar, Subharag; Huber, Manfred. Personalized Learning Path Generation in E-Learning Systems using Reinforcement Learning and Generative Adversarial Networks. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021: 92-99.
  • [34] Ponce, Hiram; Gonzalez-Mora, Guillermo; Martinez-Villasenor, Lourdes. A Reinforcement Learning Method for Continuous Domains Using Artificial Hydrocarbon Networks. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018: 398-403.
  • [35] An, S.; Lin, S.-K.; Qiao, J.-Z.; Li, C.-H. Object detection via learning occluded features based on generative adversarial networks. Kongzhi yu Juece/Control and Decision, 2021, 36(05): 1199-1205.
  • [36] Chan, Yu; Lin, Pin-Yu; Tseng, Yu-Yun; Chen, Jen-Jee; Tseng, Yu-Chee. Learning-Based WiFi Fingerprint Inpainting via Generative Adversarial Networks. 2024 33RD INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS, ICCCN 2024, 2024.
  • [37] Cao, Xingjian; Sun, Gang; Yu, Hongfang; Guizani, Mohsen. PerFED-GAN: Personalized Federated Learning via Generative Adversarial Networks. IEEE INTERNET OF THINGS JOURNAL, 2023, 10(05): 3749-3762.
  • [38] Lim, Heechul; Chon, Kang-Wook; Kim, Min-Soo. Active learning using Generative Adversarial Networks for improving generalization and avoiding distractor points. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 227.
  • [39] Pedersen, Ole-Magnus; Misimi, Ekrem; Chaumette, Francois. Grasping Unknown Objects by Coupling Deep Reinforcement Learning, Generative Adversarial Networks, and Visual Servoing. 2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020: 5655-5662.
  • [40] Burtea, Radu; Tsay, Calvin. Constrained continuous-action reinforcement learning for supply chain inventory management. COMPUTERS & CHEMICAL ENGINEERING, 2024, 181.