Efficiently Mastering the Game of NoGo with Deep Reinforcement Learning Supported by Domain Knowledge

Times cited: 3
Authors
Gao, Yifan [1 ]
Wu, Lezhou [2 ]
Affiliations
[1] Northeastern University, College of Medicine & Biological Information Engineering, Shenyang 110819, Liaoning, China
[2] Northeastern University, College of Information Science & Engineering, Shenyang 110819, Liaoning, China
Keywords
artificial intelligence; deep learning; AlphaZero; NoGo games; reinforcement learning; Go
DOI
10.3390/electronics10131533
Chinese Library Classification (CLC) number
TP [automation and computer technology]
Discipline code
0812
Abstract
Computer games have long been regarded as an important field of artificial intelligence (AI). The AlphaZero architecture has been successful in the game of Go, beating top professional human players and becoming the baseline method in computer games. However, the AlphaZero training process requires tremendous computing resources, imposing additional difficulties on AlphaZero-based AI. In this paper, we propose NoGoZero+, which improves the AlphaZero process and applies it to NoGo, a game similar to Go. NoGoZero+ employs several innovative features to improve training speed and performance, and most of these improvement strategies can be transferred to other domains. Compared with the original AlphaZero process, NoGoZero+ increases training speed by a factor of about six. Moreover, in our experiments, our agent beat the original AlphaZero agent with a score of 81:19 after being trained on only 20,000 self-play games (a small quantity compared with the 120,000 self-play games consumed by the original AlphaZero). The NoGo program based on NoGoZero+ was the runner-up in the 2020 China Computer Game Championship (CCGC) despite limited resources, defeating many AlphaZero-based programs. Our code, pretrained models, and self-play datasets are publicly available. The ultimate goal of this paper is to provide exploratory insights and mature auxiliary tools that enable AI researchers and the computer-game community to study, test, and improve these promising state-of-the-art methods at a much lower cost in computing resources.
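For readers unfamiliar with NoGo: its rules invert Go's capturing logic. A move is illegal if it would capture any opponent stones or be suicide for one's own group, and the first player with no legal move loses. A minimal legality check along these lines (an illustrative sketch based on the standard NoGo rules, not code from the paper) could look like:

```python
from typing import Iterator, List, Set, Tuple

Coord = Tuple[int, int]

def neighbors(p: Coord, n: int) -> Iterator[Coord]:
    """Orthogonal neighbors of p on an n x n board."""
    r, c = p
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < n and 0 <= c + dc < n:
            yield (r + dr, c + dc)

def group_and_liberties(board: List[List[str]], p: Coord) -> Tuple[Set[Coord], Set[Coord]]:
    """Flood-fill the group containing p; return (stones, liberties)."""
    n = len(board)
    color = board[p[0]][p[1]]
    stack, stones, libs = [p], {p}, set()
    while stack:
        q = stack.pop()
        for nb in neighbors(q, n):
            v = board[nb[0]][nb[1]]
            if v == '.':
                libs.add(nb)
            elif v == color and nb not in stones:
                stones.add(nb)
                stack.append(nb)
    return stones, libs

def is_legal(board: List[List[str]], move: Coord, color: str) -> bool:
    """NoGo legality: the point is empty, the move is not suicide, and it
    does not remove the last liberty of any adjacent opponent group."""
    n = len(board)
    r, c = move
    if board[r][c] != '.':
        return False
    board[r][c] = color                      # tentatively place the stone
    try:
        _, own_libs = group_and_liberties(board, move)
        if not own_libs:                     # suicide: forbidden in NoGo
            return False
        opp = 'W' if color == 'B' else 'B'
        for nb in neighbors(move, n):
            if board[nb[0]][nb[1]] == opp:
                _, libs = group_and_liberties(board, nb)
                if not libs:                 # would capture: forbidden in NoGo
                    return False
        return True
    finally:
        board[r][c] = '.'                    # undo the tentative placement
```

Encoding these legality constraints directly is one natural way to inject domain knowledge into an AlphaZero-style search, since the policy need never consider moves that the rules forbid.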
Pages: 16
    Xu, Yan
    [J]. Dianli Xitong Zidonghua/Automation of Electric Power Systems, 2022, 46 (01): : 60 - 68