Modular Reinforcement Learning for Playing the Game of Tron

Cited by: 0

Authors:

Jeon, Mingi [1]

Lee, Jay [2]

Ko, Sang-Ki [1]

Affiliations:

[1] Kangwon Natl Univ, Dept Comp Sci & Engn, Chunchon 24341, Gangwon Do, South Korea

[2] Hana Bank, Data & Investment Div, Seoul 04523, South Korea

Source:

IEEE Access | 2022 / Vol. 10

Keywords:

Modular learning; reinforcement learning; Tron; non-stationary environment; SHOGI; CHESS; LEVEL; GO;

DOI:

10.1109/ACCESS.2022.3175299

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Tron is a simultaneous-move two-player game in which each agent leaves a wall along the path it travels, and the agent that crashes into a wall first loses. Because the same action may lead to different outcomes (non-stationarity), basic reinforcement learning approaches are difficult to apply. In this paper, we present a modular reinforcement learning (MRL) approach to the game of Tron that decomposes the game into two phases: a non-stationary first phase and a stationary second phase. We train two separate models: the first handles the non-stationary environment, in which the two agents move simultaneously and affect each other, while the second handles the stationary environment that arises once the two agents are separated by the walls they have created and can no longer affect each other. We show that the latter model can be effectively pre-trained using randomly generated stationary environments. We evaluate our algorithm against previous algorithms, including the state-of-the-art algorithm for the game of Tron (called a1k0n), on different grid sizes. The proposed MRL-based algorithm outperforms all previous algorithms on 6 x 6 and 8 x 8 grids. Although it performs slightly worse than the strongest baseline, a1k0n, on the 10 x 10 grid, it scales better in time complexity as the grid size increases than search-based heuristics, including a1k0n.
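The abstract does not say how the switch between the two phases is detected. As a minimal illustration only, and not the authors' implementation, the sketch below assumes a grid encoded as a list of lists in which a truthy value marks a wall (each agent's current cell included), and treats the agents as separated when the sets of free cells reachable from each of them are disjoint. The names `reachable`, `is_separated`, and `select_phase` are hypothetical.

```python
from collections import deque


def reachable(grid, start):
    """Return the free cells reachable from `start` via 4-neighbour moves.

    Assumption: grid[r][c] is truthy for a wall and falsy for a free cell.
    The start cell itself is excluded from the result.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and not grid[nr][nc]:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen - {start}


def is_separated(grid, pos_a, pos_b):
    """True when the two agents share no reachable free cell."""
    return reachable(grid, pos_a).isdisjoint(reachable(grid, pos_b))


def select_phase(grid, pos_a, pos_b):
    """Pick which of the two (hypothetical) models should act."""
    if is_separated(grid, pos_a, pos_b):
        return "stationary"
    return "non-stationary"


# A 3 x 3 board split by a full column of walls: the agents are isolated.
split_grid = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
# The same board without the dividing column: the agents can still interact.
open_grid = [
    [1, 0, 1],
    [0, 0, 0],
    [0, 0, 0],
]
```

With agents at `(0, 0)` and `(0, 2)`, `select_phase` returns `"stationary"` for `split_grid` and `"non-stationary"` for `open_grid`. A breadth-first search is used here because it visits each cell at most once, so the check stays linear in the number of cells.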


Pages: 63394-63402

Page count: 9

