Cooperative decision-making algorithm with efficient convergence for UCAV formation in beyond-visual-range air combat based on multi-agent reinforcement learning

Cited by: 1
Authors
Zhou, Yaoming [1]
Yang, Fan [1]
Zhang, Chaoyue [1]
Li, Shida [1]
Wang, Yongchao [2]
Affiliations
[1] Beihang Univ, Sch Aeronaut Sci & Engn, Beijing 100191, Peoples R China
[2] Zhejiang Univ, Inst Cyber Syst & Control, Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Unmanned combat aerial vehicle (UCAV) formation; Decision-making; Beyond-visual-range (BVR) air combat; Advantage highlight; Multi-agent reinforcement learning (MARL)
DOI
10.1016/j.cja.2024.04.008
Chinese Library Classification (CLC) number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formations are expected to show their strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) performs well in cooperative decision-making, existing MARL algorithms struggle to converge quickly to an optimal strategy for UCAV formations in BVR air combat, where the confrontation is complicated and the reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of the formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the update of the actor network to improve its optimization efficiency. Finally, simulation results show that, compared with some state-of-the-art MARL algorithms, AHMAPPO obtains a better strategy with fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. The AHMAPPO can significantly increase the convergence efficiency
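The abstract names the two ingredients of AHMAPPO (a measure of how far the best formation exceeds the average across parallel environments, and additional advantage sampling fed into the actor update) but does not give the sampling rule or the loss. The sketch below is only one possible reading, not the authors' method: `highlight_degree`, `extra_sample_count`, `actor_objective`, and the parameters `scale` and `max_extra` are hypothetical names and rules introduced for illustration. Only `clipped_surrogate` is the standard PPO clipped objective.

```python
import random


def highlight_degree(episode_returns):
    """Gap between the best parallel environment's return and the mean return."""
    mean_return = sum(episode_returns) / len(episode_returns)
    return max(episode_returns) - mean_return


def extra_sample_count(degree, scale=1.0, max_extra=8):
    """Assumed rule: draw more extra samples when the best run stands out more."""
    return min(max_extra, int(degree * scale))


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped objective for a single (ratio, advantage) sample."""
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


def actor_objective(batch, best_env_batch, n_extra, rng):
    """Mean clipped objective over the regular batch plus n_extra samples
    re-drawn (with replacement) from the best environment's batch.
    Assumed mechanism; the paper's actual update may differ."""
    extra = [rng.choice(best_env_batch) for _ in range(n_extra)]
    samples = list(batch) + extra
    return sum(clipped_surrogate(r, a) for r, a in samples) / len(samples)


# Usage: three parallel environments, the third one clearly ahead.
returns = [1.0, 2.0, 6.0]
degree = highlight_degree(returns)          # best (6.0) minus mean (3.0)
n_extra = extra_sample_count(degree)
batch = [(1.0, 1.0), (1.0, 3.0)]            # (probability ratio, advantage) pairs
best_env_batch = [(1.0, 5.0)]
objective = actor_objective(batch, best_env_batch, n_extra, random.Random(0))
```

In this reading, the extra samples shift the actor's gradient toward trajectories from the environment that is currently performing best, which is one way sparse, delayed rewards could be given more weight early in training.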
Pages: 311-328
Page count: 18
Source: Chinese Journal of Aeronautics, 2024, 37 (08): 311-328