Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Cited by: 1
Authors
Kanagawa, Yuji [1]
Kaneko, Tomoyuki [2]
Affiliations
[1] Univ Tokyo, Grad Sch Arts & Sci, Tokyo, Japan
[2] Univ Tokyo, Interfac Initiat Informat Studies, Tokyo, Japan
Keywords
roguelike games; reinforcement learning; generalization; domain adaptation; neural networks; ENVIRONMENT
DOI
10.1109/cig.2019.8848075
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose Rogue-Gym, a simple, classic-style roguelike game built for evaluating generalization in reinforcement learning (RL). Combined with the recent progress of deep neural networks, RL has successfully trained human-level agents without human knowledge in many games such as those for Atari 2600. However, it has been pointed out that agents trained with RL methods often overfit the training environment, and they work poorly in slightly different environments. To investigate this problem, some research environments with procedural content generation have been proposed. Following these studies, we propose the use of roguelikes as a benchmark for evaluating the generalization ability of RL agents. In our Rogue-Gym, agents need to explore dungeons that are structured differently each time they start a new game. Thanks to the very diverse structures of the dungeons, we believe that the generalization benchmark of Rogue-Gym is sufficiently fair. In our experiments, we evaluate a standard reinforcement learning method, PPO, with and without enhancements for generalization. The results show that some enhancements believed to be effective fail to mitigate the overfitting in Rogue-Gym, although others slightly improve the generalization ability.
Pages: 8