Weighted mean field reinforcement learning for large-scale UAV swarm confrontation

Cited by: 17
Authors
Wang, Baolai [1]
Li, Shengang [1]
Gao, Xianzhong [2]
Xie, Tao [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Aerosp Sci & Engn, Changsha 410073, Hunan, Peoples R China
Keywords
Multi-agent reinforcement learning; Unmanned aerial vehicle; Swarm confrontation; Attention mechanism; LEVEL;
DOI
10.1007/s10489-022-03840-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Finding the optimal game strategy is a difficult problem in unmanned aerial vehicle (UAV) swarm confrontation. As an effective solution to the sequential decision-making problem, multi-agent reinforcement learning (MARL) provides a promising way to realize intelligent countermeasures. However, there are two challenges in applying MARL to large-scale UAV swarm confrontation: i) the curse of dimensionality caused by the excessive scale of UAV clusters and ii) the generalization problem caused by the dynamically changing UAV cluster size. To address these problems, we propose a novel MARL paradigm, called Weighted Mean Field Reinforcement Learning, where the pairwise communication between any UAV and its neighbors is modeled as that between a central UAV and the virtual UAV, which is abstracted from the weighted mean effect of neighboring UAVs. This approach reduces the multi-agent problem to a two-agent problem, which can reduce the input dimension of the agent and adapt to the changing cluster size. The communication content between UAVs includes actions and local observations. Actions can enhance the cooperation between UAVs and alleviate the non-stationarity of the environment, while local observations can expand the perception range of the central UAV so that it can obtain more useful information about the environment. The attention mechanism is leveraged to enable UAVs to select more valuable information flexibly, making our method more scalable than other algorithms. Combining this paradigm with double Q-learning and actor-critic algorithms, we propose weighted mean field Q-learning (WMFQ) and weighted mean field actor-critic (WMFAC) algorithms. Experiments on our constructed UAV swarm confrontation environment verify the effectiveness and scalability of our algorithms.
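The weighted mean effect described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the attention architecture, so scaled dot-product attention is used here as an assumption, and the names `weighted_mean_field`, `query`, `neighbor_keys`, and `neighbor_values` are hypothetical. Each neighbor's value vector stands for its action (for example, one-hot) concatenated with its local observation. The result is a single "virtual UAV" vector whose length does not depend on how many neighbors are present.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_mean_field(query, neighbor_keys, neighbor_values):
    """Collapse any number of neighbors into one virtual-neighbor vector.

    query           -- embedding of the central UAV (hypothetical name)
    neighbor_keys   -- one embedding per neighbor, same length as query
    neighbor_values -- one vector per neighbor (action + local observation)

    Assumption: attention scores are scaled dot products; the paper may
    use a different scoring function.
    """
    if not neighbor_keys or len(neighbor_keys) != len(neighbor_values):
        raise ValueError("need one key and one value per neighbor")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key in neighbor_keys
    ]
    weights = softmax(scores)
    dim = len(neighbor_values[0])
    virtual = [
        sum(w * value[i] for w, value in zip(weights, neighbor_values))
        for i in range(dim)
    ]
    return weights, virtual


# Two neighbors with identical keys get equal weights, so the virtual
# neighbor reduces to the plain (unweighted) mean of their values.
weights, virtual = weighted_mean_field(
    [1.0, 0.0],
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
)
```

Because the output length is fixed by the value dimension rather than by the neighbor count, the same downstream Q-network or critic input size can be kept while the cluster grows or shrinks, which is the scalability property the abstract claims.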
Pages: 5274-5289
Number of pages: 16
Related papers
50 in total
  • [1] Weighted mean field reinforcement learning for large-scale UAV swarm confrontation
    Baolai Wang
    Shengang Li
    Xianzhong Gao
    Tao Xie
    Applied Intelligence, 2023, 53 : 5274 - 5289
  • [2] A Large-Scale UAV Swarm Confrontation Method Based on Fuzzy Reinforcement Learning
    Hu, Chunyang
    Li, Jingchen
    Yang, Yusen
    Gu, Qiong
    Wu, Zhao
    Ning, Bin
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2025,
  • [3] A Weighted Mean Field Reinforcement Learning Algorithm for Large-Scale Multi-Agent Collaboration
    Xinwei Yuan
    He Wang
    Wenwu Yu
    Guidance,Navigation and Control, 2023, (02) : 42 - 60
  • [4] UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning
    Wang, Baolai
    Li, Shengang
    Gao, Xianzhong
    Xie, Tao
    INTERNATIONAL JOURNAL OF AEROSPACE ENGINEERING, 2021, 2021
  • [5] Large-scale UAV swarm confrontation based on hierarchical attention actor-critic algorithm
    Xiaohong Nian
    Mengmeng Li
    Haibo Wang
    Yalei Gong
    Hongyun Xiong
    Applied Intelligence, 2024, 54 : 3279 - 3294
  • [6] Hierarchical Mean-Field Deep Reinforcement Learning for Large-Scale Multiagent Systems
    Yu, Chao
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 10, 2023, : 11744 - 11752
  • [7] Large-scale UAV swarm confrontation based on hierarchical attention actor-critic algorithm
    Nian, Xiaohong
    Li, Mengmeng
    Wang, Haibo
    Gong, Yalei
    Xiong, Hongyun
    APPLIED INTELLIGENCE, 2024, 54 (04) : 3279 - 3294
  • [8] Collaborative decision-making for UAV swarm confrontation based on reinforcement learning
    Jiao, Yongkang
    Fu, Wenxing
    Cao, Xinying
    Su, Qiangqing
    Wang, Yusheng
    Shen, Zixiang
    Yu, Lanlin
    IET CONTROL THEORY AND APPLICATIONS, 2025, 19 (01):
  • [9] UAV Swarm Confrontation Based on Multi-agent Deep Reinforcement Learning
    Wang, Zhi
    Liu, Fan
    Guo, Jing
    Hong, Chen
    Chen, Ming
    Wang, Ershen
    Zhao, Yunbo
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 4996 - 5001
  • [10] Transition-Informed Reinforcement Learning for Large-Scale Stackelberg Mean-Field Games
    Li, Pengdeng
    Yu, Runsheng
    Wang, Xinrun
    An, Bo
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17469 - 17476