Maneuver Strategy Generation of UCAV for within Visual Range Air Combat Based on Multi-Agent Reinforcement Learning and Target Position Prediction

Cited by: 29
Authors
Kong, Weiren [1]
Zhou, Deyun [1]
Yang, Zhen [1]
Zhang, Kai [1]
Zeng, Lina [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710072, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 15
Funding
National Natural Science Foundation of China
Keywords
air combat; multi-agent deep reinforcement learning; maneuver strategy; network training; unmanned combat aerial vehicle
DOI
10.3390/app10155198
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
With the development of unmanned combat aerial vehicles (UCAVs) and artificial intelligence (AI), within visual range (WVR) air combat using intelligent UCAVs is expected to become common in future air combat. Because controlling highly dynamic and uncertain WVR air combat from the UCAV's ground station is not feasible, an algorithm is needed that can generate highly intelligent air combat strategies so that the UCAV can complete air combat missions independently. In this paper, a 1-vs.-1 WVR air combat strategy generation algorithm is proposed using the multi-agent deep deterministic policy gradient (MADDPG). A 1-vs.-1 WVR air combat is modeled as a two-player zero-sum Markov game (ZSMG). A method for predicting the position of the target is introduced into the model to enable the UCAV to predict the target's actions and position. Moreover, to ensure that the UCAV is not limited by the constraints of the basic fighter maneuver (BFM) library, the action space is taken to be continuous. In addition, a potential-based reward shaping method is proposed to improve the efficiency of the air combat strategy generation algorithm. Finally, the efficiency of the algorithm and the intelligence level of the resulting strategy are verified through simulation experiments. The results show that an air combat strategy that uses target position prediction is superior to one that does not.
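The abstract names potential-based reward shaping but does not give the potential function the authors use. As a minimal sketch of the general technique only, the shaped reward adds gamma * phi(s') - phi(s) to the environment reward. The state layout, the `negative_distance` potential, and the default discount factor below are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, Sequence

State = Sequence[float]


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,  # assumed discount factor, not from the paper
) -> float:
    """Return r + gamma * phi(s') - phi(s), the standard potential-based form."""
    return reward + gamma * potential(next_state) - potential(state)


def negative_distance(state: State) -> float:
    """Hypothetical potential: negative planar distance to the target.

    Assumes state = (x_own, y_own, x_target, y_target); this layout is an
    illustration only.
    """
    dx = state[2] - state[0]
    dy = state[3] - state[1]
    return -((dx * dx + dy * dy) ** 0.5)
```

With this potential, a transition that closes the distance to the target earns a positive shaping bonus even when the environment reward is zero, which is the usual motivation for shaping sparse rewards.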
Pages: 23