Large-scale UAV swarm confrontation based on hierarchical attention actor-critic algorithm

Cited by: 6

Authors:

Nian, Xiaohong ^{[1]}

Li, Mengmeng ^{[1]}

Wang, Haibo ^{[1]}

Gong, Yalei ^{[1]}

Xiong, Hongyun ^{[1]}

Affiliations:

[1] Cent South Univ, Clustered Unmanned Syst Res Inst, Sch Automat, Changsha 410073, Hunan, Peoples R China

Source:

APPLIED INTELLIGENCE | 2024 / Vol. 54 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

Large-scale UAV swarm confrontation; Hierarchical attention actor-critic; Multi-agent reinforcement learning;

DOI:

10.1007/s10489-024-05293-5

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In large-scale unmanned aerial vehicle (UAV) swarm confrontation scenarios, designing decision-making and coordination strategies becomes extremely difficult. Multi-Agent Reinforcement Learning (MARL), a novel decision-making approach to this problem, faces challenges such as poor scalability and the curse of dimensionality. To overcome these challenges, this paper proposes a Hierarchical Attention Actor-Critic (HAAC) algorithm. The HAAC algorithm includes a centralized critic network based on a Hierarchical Two-stage Attention Network (H2ANet), along with a hierarchical actor policy network that combines rule-based and reinforcement learning approaches. H2ANet is specifically designed to model the relationships between UAVs and extract crucial information from neighboring UAVs, enabling the generation of advanced cooperative and competitive strategies. The HAAC algorithm effectively reduces the dimensionality of both the action and state spaces. Experimental results demonstrate that the HAAC algorithm outperforms existing methods and can extend its learned policies to large-scale scenarios.
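
The abstract does not give H2ANet's equations, so the following is only a rough illustration of the general idea of two-stage attention over neighboring agents, not the authors' network. It assumes plain scaled dot-product attention with no learned projections; the function names (`attend`, `two_stage_attention`), the grouping of neighbors, and the feature vectors are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention: weight each value by query-key similarity."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights

def two_stage_attention(ego, groups):
    """Stage 1 pools each group of neighbors; stage 2 pools the group summaries.

    Assumption: neighbors are pre-partitioned into groups (hypothetical choice).
    """
    summaries = [attend(ego, group, group)[0] for group in groups]
    return attend(ego, summaries, summaries)

# Hypothetical 2-D feature vectors for one UAV and its neighbors.
ego = [1.0, 0.0]
groups = [
    [[1.0, 0.0], [0.5, 0.5]],   # e.g. nearby teammates (illustrative values)
    [[0.0, 1.0], [-1.0, 0.0]],  # e.g. nearby opponents (illustrative values)
]
context, group_weights = two_stage_attention(ego, groups)
```

Here `context` is a fixed-size summary regardless of how many neighbors are present, which is one way an attention-based critic might avoid an input that grows with swarm size.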

Pages: 3279-3294

Page count: 16

