Weighted mean field reinforcement learning for large-scale UAV swarm confrontation

Cited by: 17
Authors
Wang, Baolai [1]
Li, Shengang [1]
Gao, Xianzhong [2]
Xie, Tao [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Aerosp Sci & Engn, Changsha 410073, Hunan, Peoples R China
Keywords
Multi-agent reinforcement learning; Unmanned aerial vehicle; Swarm confrontation; Attention mechanism; LEVEL;
DOI
10.1007/s10489-022-03840-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Finding the optimal game strategy is a difficult problem in unmanned aerial vehicle (UAV) swarm confrontation. As an effective solution to the sequential decision-making problem, multi-agent reinforcement learning (MARL) provides a promising way to realize intelligent countermeasures. However, there are two challenges in applying MARL to large-scale UAV swarm confrontation: i) the curse of dimensionality caused by the excessive scale of UAV clusters and ii) the generalization problem caused by the dynamically changing UAV cluster size. To address these problems, we propose a novel MARL paradigm, called Weighted Mean Field Reinforcement Learning, where the pairwise communication between any UAV and its neighbors is modeled as that between a central UAV and the virtual UAV, which is abstracted from the weighted mean effect of neighboring UAVs. This approach reduces the multi-agent problem to a two-agent problem, which can reduce the input dimension of the agent and adapt to the changing cluster size. The communication content between UAVs includes actions and local observations. Actions can enhance the cooperation between UAVs and alleviate the non-stationarity of the environment, while local observations can expand the perception range of the central UAV so that it can obtain more useful information about the environment. The attention mechanism is leveraged to enable UAVs to select more valuable information flexibly, making our method more scalable than other algorithms. Combining this paradigm with double Q-learning and actor-critic algorithms, we propose weighted mean field Q-learning (WMFQ) and weighted mean field actor-critic (WMFAC) algorithms. Experiments on our constructed UAV swarm confrontation environment verify the effectiveness and scalability of our algorithms.
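The weighted mean effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scaled dot-product scoring, the softmax weighting, the one-hot action encoding, and the function name `weighted_mean_field` are all assumptions made for illustration. It shows how a variable number of neighbors can be collapsed into a single "virtual UAV" whose mean action and mean observation have a fixed size.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def weighted_mean_field(central_obs, neighbor_obs, neighbor_actions, n_actions):
    """Collapse all neighbors into one virtual neighbor (illustrative only).

    Assumptions (not taken from the paper): attention scores are scaled
    dot products between the central observation and each neighbor
    observation, and discrete actions are one-hot encoded.
    """
    central_obs = np.asarray(central_obs, dtype=float)
    neighbor_obs = np.asarray(neighbor_obs, dtype=float)
    neighbor_actions = np.asarray(neighbor_actions, dtype=int)

    # No neighbors in range: return an all-zero virtual neighbor.
    if neighbor_obs.shape[0] == 0:
        return np.zeros(n_actions), np.zeros_like(central_obs), np.zeros(0)

    # One attention score per neighbor, normalized into weights that sum to 1.
    scores = neighbor_obs @ central_obs / np.sqrt(central_obs.shape[0])
    weights = softmax(scores)

    # Weighted mean of one-hot actions and of local observations.
    one_hot = np.eye(n_actions)[neighbor_actions]
    mean_action = weights @ one_hot
    mean_obs = weights @ neighbor_obs
    return mean_action, mean_obs, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    central = rng.normal(size=4)
    for k in (3, 10):  # the output size does not depend on the neighbor count
        obs = rng.normal(size=(k, 4))
        acts = rng.integers(0, 5, size=k)
        mean_action, mean_obs, _ = weighted_mean_field(central, obs, acts, 5)
        print(k, mean_action.shape, mean_obs.shape)
```

Because the outputs have the same shape for any number of neighbors, a Q-network or critic that takes the central UAV's observation and action together with `mean_action` and `mean_obs` would keep a fixed input dimension as the swarm size changes, which is the property the abstract attributes to the two-agent reduction.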
Pages: 5274-5289
Page count: 16