Subtask-masked curriculum learning for reinforcement learning with application to UAV maneuver decision-making

Cited by: 3
Authors
Hou, Yueqi [1, 2]
Liang, Xiaolong [1, 2]
Lv, Maolong [1]
Yang, Qisong [3]
Li, Yang [3]
Affiliations
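[1] Air Force Engineering University, Air Traffic Control and Navigation School, Xi'an, People's Republic of China
[2] Air Force Engineering University, Shaanxi Key Laboratory of Meta-Synthesis for Electronic and Information Systems, Xi'an, People's Republic of China
[3] Delft University of Technology, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft, Netherlands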
Keywords
Unmanned Aerial Vehicle; Maneuver decision-making; Reinforcement learning; Curriculum learning; Knowledge transfer; Strategy
DOI
10.1016/j.engappai.2023.106703
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Learning Unmanned Aerial Vehicle (UAV) maneuver strategies with Reinforcement Learning (RL) remains challenging because the task provides only sparse rewards. In this paper, we propose Subtask-Masked curriculum learning for RL (SubMas-RL), an efficient RL paradigm that implements curriculum learning and knowledge transfer for UAV maneuver scenarios involving multiple missiles. First, this study introduces a novel concept, the subtask mask, which creates source tasks from a target task by masking some of its subtasks. Then, a subtask-masked curriculum generation method is proposed that produces a sequenced curriculum by alternating between task generation and task sequencing. To establish efficient knowledge transfer and avoid negative transfer, this paper employs two transfer techniques, policy distillation and policy reuse, along with an explicit transfer condition that masks irrelevant knowledge. Experimental results demonstrate that our method achieves a 94.8% success rate in the UAV maneuver scenario, where direct use of reinforcement learning always fails. The proposed RL framework SubMas-RL is expected to learn effective policies in complex tasks with sparse rewards.
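The subtask-mask idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a subtask mask is a binary vector with one entry per subtask (for example, one per incoming missile), that a source task is obtained by keeping only the unmasked subtasks, and that tasks are ordered simply by how many subtasks remain active. The paper alternates task generation and task sequencing, which this sketch collapses into a single enumerate-then-sort step, and it does not model policy distillation or policy reuse. All function names and the `missile_*` labels are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# Assumed representation: 1 = subtask kept, 0 = subtask masked out.
Mask = Tuple[int, ...]


def generate_source_masks(num_subtasks: int) -> List[Mask]:
    """Enumerate masks that keep at least one subtask and mask at least one."""
    masks: List[Mask] = []
    for kept in range(1, num_subtasks):
        for active in combinations(range(num_subtasks), kept):
            masks.append(tuple(1 if i in active else 0 for i in range(num_subtasks)))
    return masks


def sequence_curriculum(num_subtasks: int) -> List[Mask]:
    """Order source tasks by number of active subtasks; the target task comes last."""
    sources = sorted(generate_source_masks(num_subtasks), key=sum)
    target: Mask = tuple([1] * num_subtasks)  # nothing masked: the full target task
    return sources + [target]


def apply_mask(subtasks: List[str], mask: Mask) -> List[str]:
    """Return the subtasks that remain active under the given mask."""
    return [name for name, keep in zip(subtasks, mask) if keep]


if __name__ == "__main__":
    # Hypothetical three-missile target task.
    subtasks = ["missile_1", "missile_2", "missile_3"]
    for mask in sequence_curriculum(len(subtasks)):
        print(mask, apply_mask(subtasks, mask))
```

Under these assumptions, a three-subtask target yields six source tasks plus the target itself, ordered from single-subtask tasks up to the unmasked task. A real curriculum would likely prune this set rather than train on every mask.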
Pages: 14