A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat

Cited by: 24
Authors
Chai, Jiajun [1, 2]
Chen, Wenzhang [1, 2]
Zhu, Yuanheng [1, 2]
Yao, Zong-Xin [3]
Zhao, Dongbin [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Shenyang Aircraft Design & Res Inst, Dept Unmanned Aerial Vehicle, Shenyang 110035, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Aircraft; Aerospace control; 6-DOF; Task analysis; Nose; Missiles; Heuristic algorithms; 6-DOF unmanned combat air vehicle (UCAV); air combat; hierarchical structure; reinforcement learning (RL); self-play; LEVEL; GAME;
DOI
10.1109/TSMC.2023.3270444
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Unmanned combat air vehicle (UCAV) combat is a challenging scenario with high-dimensional continuous state and action spaces and highly nonlinear dynamics. In this article, we propose a general hierarchical framework to solve the within-visual-range (WVR) air-to-air combat problem under six-degree-of-freedom (6-DOF) dynamics. The core idea is to divide the whole decision-making process into two loops and use reinforcement learning (RL) to solve them separately. The outer loop uses a combat policy to decide the macro command according to the current combat situation. The inner loop then uses a control policy to execute the macro command by calculating the actual input signals for the aircraft. We design the Markov decision process for the control policy and the Markov game between two aircraft, and we present a two-stage training mechanism. For the control policy, we design an effective reward function to accurately track various macro behaviors. For the combat policy, we present a fictitious self-play mechanism that improves combat performance by fighting against historical combat policies. Experimental results show that the control policy achieves better tracking performance than conventional methods, and that the fictitious self-play mechanism learns a competitive combat policy that achieves high winning rates against conventional methods.
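The two-loop structure and the pool of historical opponents described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`hierarchical_step`, `OpponentPool`, the toy policies), the fixed number of inner steps per outer decision, and the uniform sampling of past policies are assumptions made for illustration, and the one-dimensional toy dynamics stand in for the 6-DOF aircraft model.

```python
import random
from typing import Callable, List

# Hypothetical type aliases; the abstract does not specify any interfaces.
State = List[float]
MacroCommand = List[float]
ControlInput = List[float]


def hierarchical_step(
    state: State,
    combat_policy: Callable[[State], MacroCommand],
    control_policy: Callable[[State, MacroCommand], ControlInput],
    dynamics: Callable[[State, ControlInput], State],
    inner_steps: int = 10,  # assumed ratio of inner to outer loop rates
):
    """Run one outer-loop decision followed by several inner-loop control steps."""
    macro = combat_policy(state)            # outer loop: choose a macro command
    for _ in range(inner_steps):
        u = control_policy(state, macro)    # inner loop: track the macro command
        state = dynamics(state, u)
    return state, macro


class OpponentPool:
    """Stores past combat policies and samples one as the training opponent."""

    def __init__(self, seed: int = 0) -> None:
        self._policies: list = []
        self._rng = random.Random(seed)

    def add(self, policy) -> None:
        self._policies.append(policy)

    def sample(self):
        if not self._policies:
            raise ValueError("opponent pool is empty")
        # Uniform sampling is an assumption, not taken from the paper.
        return self._rng.choice(self._policies)


# Toy stand-ins (assumptions): a scalar state driven toward a commanded value.
def toy_combat(state: State) -> MacroCommand:
    return [1.0]


def toy_control(state: State, macro: MacroCommand) -> ControlInput:
    return [0.5 * (macro[0] - state[0])]    # proportional tracking


def toy_dynamics(state: State, u: ControlInput) -> State:
    return [state[0] + u[0]]


if __name__ == "__main__":
    final_state, command = hierarchical_step(
        [0.0], toy_combat, toy_control, toy_dynamics
    )
    print(command, final_state)
```

In this sketch the control policy is a hand-written proportional rule only so that the example runs; in the framework described above both policies are learned with RL, the control policy first and the combat policy second.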
Pages: 5417 - 5429
Number of pages: 13
Related papers
50 in total
  • [21] Air Combat Maneuver Decision Based on Deep Reinforcement Learning and Game Theory
    Yin, Shuhui
    Kang, Yu
    Zhao, Yunbo
    Xue, Jian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 6939 - 6943
  • [22] Air combat maneuver decision based on deep reinforcement learning with auxiliary reward
    Zhang T.
    Wang Y.
    Sun M.
    Chen Z.
    Neural Computing and Applications, 2024, 36 (21) : 13341 - 13356
  • [23] Deep Relationship Graph Reinforcement Learning for Multi-Aircraft Air Combat
    Han, Yue
    Piao, Haiyin
    Hou, Yaqing
    Sun, Yang
    Sun, Zhixiao
    Zhou, Deyun
    Yang, Shengqi
    Peng, Xuanqi
    Fan, Songyuan
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [24] Research on the Reward Design Method for Deep Reinforcement Learning in WVR Air Combat
    Zhang, Xin
    Dong, Wenhan
    Zhang, Pin
    Li, Dunwang
    IEEE ACCESS, 2024, 12 : 182693 - 182707
  • [25] Multi-Objective Optimization in Air-to-Air Communication System Based on Multi-Agent Deep Reinforcement Learning
    Lin, Shaofu
    Chen, Yingying
    Li, Shuopeng
    SENSORS, 2023, 23 (23)
  • [26] Modeling and simulation of air cushion vehicle 6-DOF maneuverability
    Ji, N. (jibaodong@163.com), 1600, Advanced Institute of Convergence Information Technology (06):
  • [27] Learning and Fast Adaptation for Air Combat Decision with Improved Deep Meta-reinforcement Learning
    Zhang, Pin
    Dong, Wenhan
    Cai, Ming
    Li, Dunwang
    Zhang, Xin
    INTERNATIONAL JOURNAL OF AERONAUTICAL AND SPACE SCIENCES, 2024,
  • [28] Smooth Path Planning of 6-DOF Robot Based on Reinforcement Learning
    Tian, Jiawei
    Li, Dazi
    2022 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2022, : 89 - 93
  • [29] Data Fusion of Air Combat Based on Reinforcement Learning
    Zhou, Tongle
    Chen, Mou
    Zou, Jie
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2019), 2019, : 492 - 497
  • [30] Air combat maneuver decision-making test based on deep reinforcement learning
    Zhang S.
    Zhou P.
    He Y.
    Huang J.
    Liu G.
    Tang J.
    Jia H.
    Du X.
    Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2023, 44 (10):