Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning

Cited by: 1

Authors:

Goulart, Icaro [1]

Paes, Aline [1]

Clua, Esteban [1]

Affiliation:

[1] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil

Source:

ENTERTAINMENT COMPUTING AND SERIOUS GAMES, ICEC-JCSG 2019 | 2019 / Vol. 11863

Keywords:

Bomberman; Proximal Policy Optimization; Reinforcement Learning; LSTM; Imitation Learning;

DOI:

10.1007/978-3-030-34644-7_10

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Making artificial agents that learn how to play is a long-standing goal in the area of Game AI. Recently, several successful cases have emerged, driven by Reinforcement Learning (RL) and neural network-based approaches. However, in most cases, the results have been achieved by training directly from pixel frames with substantial computational resources. In this paper, we devise agents that learn how to play the popular game of Bomberman by relying on state representations and RL-based algorithms, without looking at the pixel level. To that end, we designed five vector-based state representations and implemented Bomberman on top of the Unity game engine through the ML-Agents toolkit. We enhance the ML-Agents algorithms by developing an Imitation Learning-based learner (IL) that improves its model with the actor-critic Proximal Policy Optimization (PPO) method. We compared this approach with a PPO-only learner that uses either a Multi-Layer Perceptron or a Long Short-Term Memory network (LSTM). We conducted several training and tournament experiments in which the agents played against each other. The hybrid state representation and our IL-followed-by-PPO learning algorithm achieve the best overall quantitative results, and we also observed that their agents learn correct Bomberman behavior.

Citation

Pages: 121 - 133

Page count: 13

Related documents: 50 in total

[21] A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning
Morales, Eduardo F.
Murrieta-Cid, Rafael
Becerra, Israel
Esquivel-Basaldua, Marco A.
[J]. INTELLIGENT SERVICE ROBOTICS, 2021, 14 (05) : 773 - 805
[22] A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning
Eduardo F. Morales
Rafael Murrieta-Cid
Israel Becerra
Marco A. Esquivel-Basaldua
[J]. Intelligent Service Robotics, 2021, 14 : 773 - 805
[23] Learning How Pedestrians Navigate: A Deep Inverse Reinforcement Learning Approach
Fahad, Muhammad
Chen, Zhuo
Guo, Yi
[J]. 2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 819 - 826
[24] The Advance of Reinforcement Learning and Deep Reinforcement Learning
Lyu, Le
Shen, Yang
Zhang, Sicheng
[J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, BIG DATA AND ALGORITHMS (EEBDA), 2022, : 644 - 648
[25] Implicit imitation in multiagent reinforcement learning
Price, B
Boutilier, C
[J]. MACHINE LEARNING, PROCEEDINGS, 1999, : 325 - 334
[26] Robotic Manipulation with Reinforcement Learning, State Representation Learning, and Imitation Learning
Chen, Hanxiao
[J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 15769 - 15770
[27] Learning to Play Precision Ball Sports from scratch: a Deep Reinforcement Learning Approach
Antao, Liliana
Sousa, Armando
Reis, Luis Paulo
Goncalves, Gil
[J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
[28] Cooperative Control for Multi-Intersection Traffic Signal Based on Deep Reinforcement Learning and Imitation Learning
Huo, Yusen
Tao, Qinghua
Hu, Jianming
[J]. IEEE ACCESS, 2020, 8 : 199573 - 199585
[29] Deep Reinforcement Learning for Articulatory Synthesis in a Vowel-to-Vowel Imitation Task
Shitov, Denis
Pirogova, Elena
Wysocki, Tadeusz A.
Lech, Margaret
[J]. SENSORS, 2023, 23 (07)
[30] Reinforcement learning building control approach harnessing imitation learning
Dey, Sourav
Marzullo, Thibault
Zhang, Xiangyu
Henze, Gregor
[J]. ENERGY AND AI, 2023, 14
