Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning

Cited by: 1
Authors
Goulart, Icaro [1]
Paes, Aline [1]
Clua, Esteban [1]
Affiliations
[1] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil
Keywords
Bomberman; Proximal Policy Optimization; Reinforcement Learning; LSTM; Imitation Learning
DOI
10.1007/978-3-030-34644-7_10
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Making artificial agents that learn how to play is a long-standing goal in the area of Game AI. Recently, several successful cases have emerged, driven by Reinforcement Learning (RL) and neural network-based approaches. However, in most cases, the results have been achieved by training directly from pixel frames with substantial computational resources. In this paper, we devise agents that learn how to play the popular game of Bomberman by relying on state representations and RL-based algorithms without looking at the pixel level. To that end, we designed five vector-based state representations and implemented Bomberman on top of the Unity game engine through the ML-Agents toolkit. We enhance the ML-Agents algorithms by developing an Imitation Learning (IL) learner that improves its model with the actor-critic Proximal Policy Optimization (PPO) method. We compared this approach with a PPO-only learner that uses either a Multi-Layer Perceptron or a Long Short-Term Memory (LSTM) network. We conducted several training and tournament experiments in which the agents play against each other. The hybrid state representation and our IL-followed-by-PPO learning algorithm achieve the best overall quantitative results, and we also observed that the resulting agents learn correct Bomberman behavior.
Pages: 121-133
Number of pages: 13