Pursuit-evasion game strategy of USV based on deep reinforcement learning in complex multi-obstacle environment

Cited by: 18

Authors:

Qu, Xiuqing [1]

Gan, Wenhao [1]

Song, Dalei [1,2]

Zhou, Liqin [1]

Affiliations:

[1] Ocean Univ China, Coll Engn, 238 Songling Rd, Qingdao 266100, Shandong, Peoples R China

[2] Ocean Univ China, Inst Adv Ocean Study, 238 Songling Rd, Qingdao 266100, Shandong, Peoples R China

Source:

OCEAN ENGINEERING | 2023 / Vol. 273

Keywords:

Unmanned surface vehicles; Pursuit-evasion game; Deep reinforcement learning; Imitation learning; Obstacle avoidance;

DOI:

10.1016/j.oceaneng.2023.114016

Chinese Library Classification (CLC) number:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification codes:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

To address the confrontation game between pursuing and evading unmanned surface vehicles (USVs) in a complex multi-obstacle environment, a pursuit-evasion game strategy is proposed. First, the multi-obstacle environment is set up, and the game situation is judged from the mutual perception of the pursuing and evading USVs. The pursuers are trained with multi-agent deep reinforcement learning so that they can quickly plan a reasonable obstacle-avoiding pursuit route and form an effective encirclement before the evader approaches the target point. The credit assignment problem among the members of the pursuing group is also considered. For the evader, deep reinforcement learning is combined with imitation learning to train the escape model, so that it reaches the preset point in as short a time as possible while avoiding the obstacles on the way. Finally, an adversarial-evolutionary game training method under multiple random scenarios is designed and combined with curriculum learning to iteratively update the pursuit and escape models. Detailed comparative analysis of the model training process and simulation experiments shows that the two proposed types of models achieve higher convergence efficiency and stability, and that they pursue, escape, and avoid obstacles more intelligently.

Pages: 20

50 items in total

[21] Near-optimal interception strategy for orbital pursuit-evasion using deep reinforcement learning
Zhang, Jingrui
Zhang, Kunpeng
Zhang, Yao
Shi, Heng
Tang, Liang
Li, Mou
[J]. ACTA ASTRONAUTICA, 2022, 198 : 9 - 25
[22] Terminal-guidance Based Reinforcement-learning for Orbital Pursuit-evasion Game of the Spacecraft
Geng Y.-Z.
Yuan L.
Huang H.
Tang L.
[J]. Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (05): 974 - 984
[23] An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning
Wan, Kaifang
Wu, Dingwei
Zhai, Yiwei
Li, Bo
Gao, Xiaoguang
Hu, Zijian
[J]. ENTROPY, 2021, 23 (11)
[24] Integral reinforcement learning based dynamic stackelberg pursuit-evasion game for unmanned surface vehicles
Hu, Xiaoxiang
Liu, Shuaizheng
Xu, Jingwen
Xiao, Bing
Guo, Chenguang
[J]. ALEXANDRIA ENGINEERING JOURNAL, 2024, 108 : 428 - 435
[25] Game-Theoretic Analysis of a Visibility Based Pursuit-Evasion Game in the Presence of a Circular Obstacle
Bhattacharya, S.
Basar, T.
Hovakimyan, N.
[J]. NUMERICAL ANALYSIS AND APPLIED MATHEMATICS (ICNAAM 2012), VOLS A AND B, 2012, 1479 : 1222 - 1225
[26] Autonomous navigation of UAV in multi-obstacle environments based on a Deep Reinforcement Learning approach
Zhang, Sitong
Li, Yibing
Dong, Qianhui
[J]. Applied Soft Computing, 2022, 115
[28] A UAV Pursuit-Evasion Strategy Based on DDPG and Imitation Learning
Fu, Xiaowei
Zhu, Jindong
Wei, Zhaoying
Wang, Hui
Li, Sili
[J]. INTERNATIONAL JOURNAL OF AEROSPACE ENGINEERING, 2022, 2022
[29] Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits
Yu, Weizhuo
Liu, Chuang
Yue, Xiaokui
[J]. CONTROL ENGINEERING PRACTICE, 2024, 153
[30] Obstacle avoidance USV in multi-static obstacle environments based on a deep reinforcement learning approach
Jiang, Dengyao
Yuan, Mingzhe
Xiong, Junfeng
Xiao, Jinchao
Duan, Yong
[J]. MEASUREMENT & CONTROL, 2024, 57 (04): 415 - 427
