Learning How to Play Bomberman with Deep Reinforcement and Imitation Learning

Cited by: 1

Authors:

Goulart, Icaro [1]

Paes, Aline [1]

Clua, Esteban [1]

Affiliations:

[1] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil

Source:

ENTERTAINMENT COMPUTING AND SERIOUS GAMES, ICEC-JCSG 2019 | 2019 / Vol. 11863

Keywords:

Bomberman; Proximal Policy Optimization; Reinforcement Learning; LSTM; Imitation Learning

DOI:

10.1007/978-3-030-34644-7_10

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Making artificial agents that learn how to play is a long-standing goal in the area of Game AI. Recently, several successful cases have emerged, driven by Reinforcement Learning (RL) and neural network-based approaches. However, in most cases, the results have been achieved by training directly from pixel frames with considerable computational resources. In this paper, we devise agents that learn how to play the popular game of Bomberman by relying on state representations and RL-based algorithms without looking at the pixel level. To that end, we designed five vector-based state representations and implemented Bomberman on top of the Unity game engine through the ML-Agents toolkit. We enhance the ML-Agents algorithms by developing an Imitation-based learner (IL) that improves its model with the actor-critic Proximal Policy Optimization (PPO) method. We compared this approach with a PPO-only learner that uses either a Multi-Layer Perceptron or a Long Short-Term Memory network (LSTM). We conducted several training and tournament experiments by making the agents play against each other. The hybrid state representation and our IL-followed-by-PPO learning algorithm achieve the best overall quantitative results, and we also observed that their agents learn correct Bomberman behavior.

Pages: 121-133

Page count: 13
