Cross coordination of behavior clone and reinforcement learning for autonomous within-visual-range air combat

Cited by: 0
Authors
Li, Lun [1 ,2 ]
Zhang, Xuebo [1 ]
Qian, Chenxu [1 ,2 ]
Zhao, Minghui [1 ,2 ]
Wang, Runhua [1 ,2 ]
Affiliations
[1] Nankai Univ, Inst Robot & Automat Informat Syst, Coll Artificial Intelligence, Tianjin, Peoples R China
[2] Nankai Univ, Tianjin Key Lab Intelligent Robot, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
WVR air combat; Fixed-wing plane; Behavior clone; PPO; IMITATION; LEVEL; GAME;
DOI
10.1016/j.neucom.2024.127591
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In this article, we propose a novel hierarchical framework for within-visual-range (WVR) air-to-air combat under the complex nonlinear 6-degrees-of-freedom (6-DOF) dynamics of the aircraft and missile. The decision process is constructed as two layers, from top to bottom, each solved separately with reinforcement learning. The top layer runs a new combat policy that decides the autopilot commands (such as target heading, velocity, and altitude) and missile launch according to the current combat situation. The bottom layer then uses a control policy to track these autopilot commands by computing the actual input signals (deflections of the rudder, elevator, and aileron, and the throttle setting) for the aircraft. For the combat policy, we present a new learning method called "E2L" that mimics expert knowledge under the two-layer decision framework to bootstrap the intelligence of the agent in the early stage of training. This method establishes a cross coordination of behavior clone (BC) and proximal policy optimization (PPO). Under this mechanism, the agent is alternately updated around the latest strategy, using BC with gradient clipping and PPO with a Kullback-Leibler divergence loss together with modified BC demonstration trajectories, which allows it to learn competitive combat strategies more stably and quickly. Extensive experimental results show that the proposed method achieves better combat performance than the baselines.
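The abstract describes an alternating BC/PPO coordination with gradient clipping on the BC side. The following is a minimal illustrative sketch of that alternation pattern, not the authors' implementation: the function names (`clip_gradient`, `e2l_schedule`), the fixed alternation period, and the L2-norm clipping rule are all assumptions made for illustration.

```python
# Hypothetical sketch of the "E2L" cross coordination described in the
# abstract: the agent alternates between a behavior-clone (BC) phase with
# gradient clipping and a PPO phase, each update made around the latest
# policy. Names and the alternation rule are illustrative assumptions.

def clip_gradient(grad, max_norm):
    """Scale a gradient vector so its L2 norm does not exceed max_norm
    (the clipping used in the BC phase)."""
    norm = sum(g * g for g in grad) ** 0.5
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grad]
    return list(grad)

def e2l_schedule(num_iters, bc_every=2):
    """Return the phase ("BC" or "PPO") for each training iteration,
    alternating the two update types around the latest strategy."""
    return ["BC" if t % bc_every == 0 else "PPO" for t in range(num_iters)]
```

For example, `e2l_schedule(4)` yields `["BC", "PPO", "BC", "PPO"]`, and `clip_gradient([3.0, 4.0], 1.0)` rescales the gradient to unit norm; in a full training loop each BC phase would fit the policy to (modified) expert demonstration trajectories, while each PPO phase would optimize the clipped surrogate objective plus a KL-divergence penalty.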
Pages: 13