Cooperative decision-making algorithm with efficient convergence for UCAV formation in beyond-visual-range air combat based on multi-agent reinforcement learning

Cited by: 1
Authors
Zhou, Yaoming [1]
Yang, Fan [1]
Zhang, Chaoyue [1]
Li, Shida [1]
Wang, Yongchao [2]
Affiliations
[1] Beihang Univ, Sch Aeronaut Sci & Engn, Beijing 100191, Peoples R China
[2] Zhejiang Univ, Inst Cyber Syst & Control, Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Unmanned combat aerial vehicle (UCAV) formation; Decision-making; Beyond-visual-range (BVR) air combat; Advantage highlight; Multi-agent reinforcement learning (MARL)
DOI
10.1016/j.cja.2024.04.008
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formations are expected to bring out their strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) performs well in cooperative decision-making, existing MARL algorithms struggle to converge quickly to an optimal strategy for UCAV formations in BVR air combat, where the confrontation is complicated and rewards are extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of the formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the update of the actor network to improve its optimization efficiency. Finally, simulation results show that, compared with some state-of-the-art MARL algorithms, AHMAPPO obtains a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. The AHMAPPO can significantly increase the convergence efficiency
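The abstract does not give the formulas behind the "advantage highlight" step, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes that each parallel environment yields one scalar score and one advantage estimate per step, that the "degree" is the gap between the best score and the mean score, and that extra sampling means repeating the best environment's advantage a bounded number of times in the actor batch. The function names and the `scale` and `cap` parameters are hypothetical.

```python
from statistics import mean


def highlight_degree(env_scores):
    """Gap between the best parallel environment and the average (assumed definition)."""
    return max(env_scores) - mean(env_scores)


def extra_sample_count(degree, scale=1.0, cap=4):
    """Map the gap to a bounded number of extra samples (hypothetical rule)."""
    if degree <= 0:
        return 0
    return min(cap, int(round(degree * scale)))


def build_actor_batch(env_scores, env_advantages, scale=1.0, cap=4):
    """Return the advantages used for the actor update, with the best
    environment's advantage repeated according to the highlight degree."""
    best = max(range(len(env_scores)), key=lambda i: env_scores[i])
    degree = highlight_degree(env_scores)
    n_extra = extra_sample_count(degree, scale, cap)
    batch = list(env_advantages)
    # Additional advantage sampling: oversample the best-performing environment.
    batch.extend([env_advantages[best]] * n_extra)
    return batch, degree, n_extra


if __name__ == "__main__":
    scores = [1.0, 2.0, 6.0]        # hypothetical per-environment scores
    advantages = [-0.5, 0.1, 1.2]   # hypothetical advantage estimates
    print(build_actor_batch(scores, advantages))
```

In this sketch the oversampled batch would then feed a standard PPO-style actor loss; how AHMAPPO actually weights or samples the advantages is defined in the paper itself.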
Pages: 311-328
Number of pages: 18