UAV intelligent attack strategy generation model based on multi-agent game reinforcement learning

Cited by: 1
Authors
Zhao Z. [1]
Cao L. [1]
Chen X. [1]
Lai J. [1]
Zhang L. [1]
Affiliation
[1] Command and Control Engineering College, Army Engineering University of PLA, Nanjing
Keywords
Markov stochastic game; multi-agent game reinforcement learning; tactical strategy; unmanned aerial vehicle (UAV)
DOI
10.12305/j.issn.1001-506X.2023.10.21
Abstract
How to use new combat forces, represented by offensive unmanned aerial vehicles (UAVs), to enhance combat effectiveness is a focus of research on intelligent and unmanned warfare. Building on the key technology of multi-agent game reinforcement learning for UAV intelligent attack and on the basic concept of Markov stochastic games, this article establishes a model for generating UAV intelligent attack strategies. It also proposes an optimization method that applies the "trembling hand perfect" idea from game theory to improve the strategy model. Simulation experiments show that the optimized algorithm improves on the original algorithm and that the trained model can generate a variety of attack tactics in real time, which is of practical significance for intelligent command and control. © 2023 Chinese Institute of Electronics. All rights reserved.
Pages: 3165-3171
Number of pages: 6
Related papers
33 in total
  • [1] SUN Y, LI Q W, XU Z X, et al., Game confrontation strategy training model for air combat based on multi-agent deep reinforcement learning, Command Information System and Technology, 12, 2, pp. 16-20, (2021)
  • [2] CHEN X L, CAO L, SHEN C., Research on action sequence planning based on deep inverse reinforcement learning, National Defense Science & Technology, 40, 4, pp. 55-61, (2019)
  • [3] CAO L, SUN Y, CHEN X L, et al., Key technology and application of intelligent mission planning in joint operations, National Defense Science & Technology, 41, 3, pp. 49-56, (2020)
  • [4] CAO L, CHEN X L, TANG W., Intelligent army construction, National Defense Science & Technology, 40, 4, pp. 14-19, (2019)
  • [5] CHEN X L, LI Q W, SUN Y., Key technologies for air combat intelligent decision based on game confrontation, Command Information System and Technology, 12, 2, (2021)
  • [6] SUNEHAG P, LEVER G, GRUSLYS A, et al., Value-decomposition networks for cooperative multi-agent learning [C], Proc. of the 17th International Conference on Autonomous Agents and Multiagent Systems, pp. 10-15, (2018)
  • [7] RASHID T, SAMVELYAN M, WITT C D, et al., QMIX: monotonic value function factorisation for deep multi-agent reinforcement learning [C], Proc. of the 35th International Conference on Machine Learning, pp. 4295-4304, (2018)
  • [8] YANG Y, LUO R, LI M, et al., Mean field multi-agent reinforcement learning [C], Proc. of the 35th International Conference on Machine Learning, pp. 5571-5580, (2018)
  • [9] FOERSTER J N, CHEN R Y, AL-SHEDIVAT M, et al., Learning with opponent-learning awareness, Proc. of the 17th International Conference on Autonomous Agents and Multiagent Systems, pp. 122-130, (2017)
  • [10] PENG P, WEN Y, YANG Y, et al., Multiagent bidirectionally-coordinated nets: emergence of human-level coordination in learning to play StarCraft combat games [EB/OL]