An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning

Cited by: 20

Authors:

Wan, Kaifang [1]

Wu, Dingwei [1]

Zhai, Yiwei [1]

Li, Bo [1]

Gao, Xiaoguang [1]

Hu, Zijian [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710072, Peoples R China

Source:

ENTROPY | 2021 / Vol. 23 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

pursuit-evasion; multi-agent; deep reinforcement learning; decision-making; adversarial learning; MADDPG;

DOI:

10.3390/e23111433

Chinese Library Classification (CLC) number:

O4 [Physics];

Discipline classification code:

0702 ;

Abstract:

A pursuit-evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit-evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.


Pages: 22

50 entries in total

[31] On Developing a UAV Pursuit-Evasion Policy Using Reinforcement Learning
Vlahov, Bogdan
Squires, Eric
Strickland, Laura
Pippin, Charles
[J]. 2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018 : 859 - 864
[32] Adaptive Optimal Control via Q-Learning for Multi-Agent Pursuit-Evasion Games
Dong, Xu
Zhang, Huaguang
Ming, Zhongyang
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (06) : 3056 - 3060
[33] Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning
Guo, Delin
Tang, Lan
Yang, Lvxi
Liang, Ying-Chang
[J]. IEEE Workshop on Signal Processing Advances in Wireless Communications, SPAWC, 2022, 2022-July
[34] Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning
Guo, Delin
Tang, Lan
Yang, Lvxi
Liang, Ying-Chang
[J]. 2022 IEEE 23RD INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATION (SPAWC), 2022
[35] Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios
Liu, Qi
Li, Zirui
Li, Xueyuan
Wu, Jingda
Yuan, Shihua
[J]. IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 2022, 2022-October : 4074 - 4081
[36] Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios
Liu, Qi
Li, Zirui
Li, Xueyuan
Wu, Jingda
Yuan, Shihua
[J]. 2022 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2022 : 4074 - 4081
[37] Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning
Zhou, Zejian
Xu, Hao
[J]. NEUROCOMPUTING, 2022, 484 : 46 - 58
[38] Near-optimal interception strategy for orbital pursuit-evasion using deep reinforcement learning
Zhang, Jingrui
Zhang, Kunpeng
Zhang, Yao
Shi, Heng
Tang, Liang
Li, Mou
[J]. ACTA ASTRONAUTICA, 2022, 198 : 9 - 25
[39] Event-triggered multi-agent credit allocation pursuit-evasion algorithm
Zhang, Bo-Kun
Hu, Bin
Zhang, Ding-Xue
Guan, Zhi-Hong
Cheng, Xin-Ming
[J]. NEURAL PROCESSING LETTERS, 2023, 55 (01) : 789 - 802
[40] Pursuer Assignment and Control Strategies in Multi-Agent Pursuit-Evasion Under Uncertainties
Zhang, Leiming
Prorok, Amanda
Bhattacharya, Subhrajit
[J]. FRONTIERS IN ROBOTICS AND AI, 2021, 8
