Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning

Cited by: 1
Authors
Goulart, Icaro [1]
Paes, Aline [1]
Clua, Esteban [1]
Affiliations
[1] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil
Keywords
Bomberman; Proximal Policy Optimization; Reinforcement Learning; LSTM; Imitation Learning
DOI
10.1007/978-3-030-34644-7_10
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Making artificial agents that learn how to play is a long-standing goal in the area of Game AI. Recently, several successful cases have emerged, driven by Reinforcement Learning (RL) and neural network-based approaches. However, in most cases, the results have been achieved by training directly from pixel frames using substantial computational resources. In this paper, we devise agents that learn how to play the popular game Bomberman by relying on state representations and RL-based algorithms, without looking at the pixel level. To that end, we designed five vector-based state representations and implemented Bomberman on top of the Unity game engine through the ML-Agents toolkit. We enhance the ML-Agents algorithms by developing an imitation-based learner (IL) that improves its model with the actor-critic Proximal Policy Optimization (PPO) method. We compared this approach with a PPO-only learner that uses either a Multi-Layer Perceptron or a Long Short-Term Memory (LSTM) network. We conducted several training and tournament experiments in which the agents play against each other. The hybrid state representation and our IL-followed-by-PPO learning algorithm achieve the best overall quantitative results, and we also observed that their agents learn correct Bomberman behavior.
Pages: 121-133
Page count: 13
Related papers
50 in total
  • [1] Learning for a robot: Deep reinforcement learning, imitation learning, transfer learning
    Hua, Jiang
    Zeng, Liangcai
    Li, Gongfa
    Ju, Zhaojie
    [J]. Sensors (Switzerland), 2021, 21 (04): 1 - 21
  • [2] Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning
    Hua, Jiang
    Zeng, Liangcai
    Li, Gongfa
    Ju, Zhaojie
    [J]. SENSORS, 2021, 21 (04) : 1 - 21
  • [3] Tracking the Race Between Deep Reinforcement Learning and Imitation Learning
    Gros, Timo P.
    Hoeller, Daniel
    Hoffmann, Joerg
    Wolf, Verena
    [J]. QUANTITATIVE EVALUATION OF SYSTEMS (QEST 2020), 2020, 12289 : 11 - 17
  • [4] A Penetration Strategy Combining Deep Reinforcement Learning and Imitation Learning
    Wang, Xiaofang
    Gu, Kunren
    [J]. Yuhang Xuebao/Journal of Astronautics, 2023, 44 (06): 914 - 925
  • [5] Cloud Resource Scheduling With Deep Reinforcement Learning and Imitation Learning
    Guo, Wenxia
    Tian, Wenhong
    Ye, Yufei
    Xu, Lingxiao
    Wu, Kui
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (05): 3576 - 3586
  • [6] Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid
    Veith, Eric Msp
    Logemann, Torben
    Berezin, Aleksandr
    Wellssow, Arlena
    Balduin, Stephan
    [J]. 2024 12TH WORKSHOP ON MODELING AND SIMULATION OF CYBER-PHYSICAL ENERGY SYSTEMS, MSCPES, 2024,
  • [7] Deep imitation reinforcement learning with expert demonstration data
    Yi, Menglong
    Xu, Xin
    Zeng, Yujun
    Jung, Seul
    [J]. JOURNAL OF ENGINEERING-JOE, 2018, (16): 1567 - 1573
  • [8] Learning How to Actively Learn: A Deep Imitation Learning Approach
    Liu, Ming
    Buntine, Wray
    Haffari, Gholamreza
    [J]. PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1874 - 1883
  • [9] Building Safe and Stable DNN Controllers using Deep Reinforcement Learning and Deep Imitation Learning
    He, Xudong
    [J]. 2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2022, : 775 - 784
  • [10] Deep imitation reinforcement learning for self-driving by vision
    Zou, Qijie
    Xiong, Kang
    Fang, Qiang
    Jiang, Bohan
    [J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2021, 6 (04) : 493 - 503