Research on Action Strategies and Simulations of DRL and MCTS-based Intelligent Round Game

Cited by: 4

Authors:

Sun, Yuxiang [1]

Yuan, Bo [2]

Zhang, Yongliang [4]

Zheng, Wanwen [3]

Xia, Qingfeng [3]

Tang, Bojian [3]

Zhou, Xianzhong [3]

Affiliations:

[1] Nanjing Univ, Coll Engn Management, 22 Hankou Rd, Nanjing, Jiangsu, Peoples R China

[2] Derby Univ, Sch Comp & Engn, Derby, England

[3] Nanjing Univ, Sch Engn Management, 22 Hankou Rd, Nanjing, Jiangsu, Peoples R China

[4] Army Engn Univ Nanjing, Nanjing, Jiangsu, Peoples R China

Source:

INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS | 2021 / Vol. 19 / No. 09

Keywords:

DDQN; deep reinforcement learning; MCTS; round game; CARLO TREE-SEARCH; ARCADE LEARNING-ENVIRONMENT; GO;

DOI:

10.1007/s12555-020-0277-0

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The reinforcement learning problem of complex action control in multiplayer online battlefield games has attracted considerable interest in the deep learning field. This problem involves more complex state and action spaces than traditional confrontation games, making it difficult to find any strategy with human-level performance. This paper presents a deep reinforcement learning model that addresses the problem from the perspectives of game simulation and algorithm implementation. A reverse reinforcement-learning model based on high-level player training data is established to support downstream algorithms. With less training data, the proposed model converges faster and is more consistent with the decision-making strategies of high-level players. An intelligent deduction algorithm based on DDQN is then developed to achieve better generalization under the guidance of a given reward function. At the game simulation level, this paper constructs a Monte Carlo Tree Search Intelligent Decision Model for turn-based antagonistic deduction games to generate next-step actions. Furthermore, a prototype game simulator that combines offline and online functions is implemented to verify the performance of the proposed model and algorithm. The experiments show that the proposed approach not only has better reference value for antagonistic environments with incomplete information, but is also accurate and effective in predicting the return value. Moreover, this work provides a theoretical validation platform and testbed for related research on game AI for deductive games.
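
For readers unfamiliar with the DDQN keyword, the following is a minimal sketch of the standard Double DQN bootstrap target, in which the online network selects the next action and the target network evaluates it. It is not the authors' implementation: the function name `ddqn_target`, the plain-list Q-value inputs, and the default discount factor are illustrative assumptions, since the abstract gives no such details.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> float:
    """Return the Double DQN target for a single transition.

    q_online_next: Q-values of the online network for the next state.
    q_target_next: Q-values of the target network for the same next state.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    print(ddqn_target(1.0, False, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5))
```

Decoupling selection from evaluation in this way is what distinguishes Double DQN from plain DQN, which would take the maximum of the target network's own Q-values.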

Pages: 2984-2998

Page count: 15
