Efficiently Mastering the Game of NoGo with Deep Reinforcement Learning Supported by Domain Knowledge

Times cited: 3
Authors
Gao, Yifan [1 ]
Wu, Lezhou [2 ]
Affiliations
[1] Northeastern University, College of Medicine & Biological Information Engineering, Shenyang 110819, Liaoning, China
[2] Northeastern University, College of Information Science & Engineering, Shenyang 110819, Liaoning, China
Keywords
artificial intelligence; deep learning; AlphaZero; NoGo games; reinforcement learning; Go
DOI
10.3390/electronics10131533
Chinese Library Classification (CLC) number
TP [automation and computer technology]
Discipline code
0812
Abstract
Computer games have long been regarded as an important field of artificial intelligence (AI). The AlphaZero architecture has been successful in the game of Go, beating top professional human players and becoming the baseline method in computer games. However, the AlphaZero training process requires tremendous computing resources, imposing additional difficulties on AlphaZero-based AI. In this paper, we propose NoGoZero+, which improves the AlphaZero process and applies it to NoGo, a game similar to Go. NoGoZero+ employs several innovative features to improve training speed and performance, and most of these improvement strategies can be transferred to other domains. Compared with the original AlphaZero process, NoGoZero+ increases training speed by a factor of about six. Moreover, in our experiments, our agent beat the original AlphaZero agent with a score of 81:19 after being trained on only 20,000 self-play games (a small quantity compared with the 120,000 self-play games consumed by the original AlphaZero). The NoGo program based on NoGoZero+ was the runner-up in the 2020 China Computer Game Championship (CCGC) despite limited resources, defeating many AlphaZero-based programs. Our code, pretrained models, and self-play datasets are publicly available. The ultimate goal of this paper is to provide exploratory insights and mature auxiliary tools that enable AI researchers and the computer-game community to study, test, and improve these promising state-of-the-art methods at a much lower cost in computing resources.
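For readers unfamiliar with NoGo: its rules invert Go's capturing logic. A move is illegal if it would capture any opponent stones or be suicide for one's own group, and the first player with no legal move loses. A minimal legality check along these lines (an illustrative sketch based on the standard NoGo rules, not code from the paper) could look like:

```python
from typing import Iterator, List, Set, Tuple

Coord = Tuple[int, int]

def neighbors(p: Coord, n: int) -> Iterator[Coord]:
    """Orthogonal neighbors of p on an n x n board."""
    r, c = p
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < n and 0 <= c + dc < n:
            yield (r + dr, c + dc)

def group_and_liberties(board: List[List[str]], p: Coord) -> Tuple[Set[Coord], Set[Coord]]:
    """Flood-fill the group containing p; return (stones, liberties)."""
    n = len(board)
    color = board[p[0]][p[1]]
    stack, stones, libs = [p], {p}, set()
    while stack:
        q = stack.pop()
        for nb in neighbors(q, n):
            v = board[nb[0]][nb[1]]
            if v == '.':
                libs.add(nb)
            elif v == color and nb not in stones:
                stones.add(nb)
                stack.append(nb)
    return stones, libs

def is_legal(board: List[List[str]], move: Coord, color: str) -> bool:
    """NoGo legality: the point is empty, the move is not suicide, and it
    does not remove the last liberty of any adjacent opponent group."""
    n = len(board)
    r, c = move
    if board[r][c] != '.':
        return False
    board[r][c] = color                      # tentatively place the stone
    try:
        _, own_libs = group_and_liberties(board, move)
        if not own_libs:                     # suicide: forbidden in NoGo
            return False
        opp = 'W' if color == 'B' else 'B'
        for nb in neighbors(move, n):
            if board[nb[0]][nb[1]] == opp:
                _, libs = group_and_liberties(board, nb)
                if not libs:                 # would capture: forbidden in NoGo
                    return False
        return True
    finally:
        board[r][c] = '.'                    # undo the tentative placement
```

Encoding these legality constraints directly is one natural way to inject domain knowledge into an AlphaZero-style search, since the policy need never consider moves that the rules forbid.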
Pages: 16
    Xu, Yan
    [J]. Dianli Xitong Zidonghua/Automation of Electric Power Systems, 2022, 46 (01): : 60 - 68