High-Speed Three-Dimensional Aerial Vehicle Evasion Based on a Multi-Stage Dueling Deep Q-Network
Cited by: 0
Authors:
Yang, Yefeng [1,2]
Huang, Tao [1,2]
Wang, Xinxin [1]
Wen, Chih-Yung [2]
Huang, Xianlin [1]
Affiliations:
[1] Harbin Inst Technol, Ctr Control Theory & Guidance Technol, Harbin 150001, Peoples R China
[2] Hong Kong Polytech Univ, Dept Aeronaut & Aviat Engn, Hong Kong, Peoples R China
aerial vehicle evasion;
deep reinforcement learning;
dueling deep Q-network;
multi-stage training;
DIFFERENTIAL GAME;
GUIDANCE LAW;
PURSUERS;
MANEUVER;
EQUATION;
EVADERS;
DOI:
10.3390/aerospace9110673
Chinese Library Classification (CLC) number:
V [Aviation, Aerospace];
Discipline classification code:
08 ;
0825 ;
Abstract:
This paper proposes a multi-stage dueling deep Q-network (MS-DDQN) algorithm to address the high-speed aerial vehicle evasion problem. High-speed aerial vehicle pursuit and evasion is an ongoing game that attracts significant research attention in the field of autonomous aerial vehicle decision making. However, traditional maneuvering methods are usually not applicable in high-speed scenarios. The MS-DDQN-based method is independent of the aerial vehicle model and searches for an approximately optimal maneuvering policy by iteratively interacting with the environment. Furthermore, a multi-stage learning mechanism is introduced to improve the quality of the training data. Simulation experiments compare the proposed method with several typical evasion maneuvering policies and demonstrate the effectiveness and robustness of the proposed MS-DDQN algorithm.
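The abstract does not describe the network itself. For orientation only, the sketch below shows the standard dueling aggregation that gives dueling deep Q-networks their name: a scalar state value V(s) and per-action advantages A(s, a) are combined as Q(s, a) = V(s) + A(s, a) - mean(A). The function names, the three-action setup, and the numbers are assumptions made for illustration, not details taken from the paper, and the multi-stage training mechanism is not modeled here.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) with mean-subtracted aggregation."""
    # Subtracting the mean advantage keeps V and A identifiable.
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values):
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical outputs of the value and advantage streams for one state
# with three discrete maneuver commands (illustrative numbers only).
q = dueling_q_values(1.0, [0.5, -0.5, 0.0])
best = greedy_action(q)
print(q, best)
```

In a trained agent the two inputs would be produced by separate heads of a neural network; here they are plain numbers so that the aggregation step can be read in isolation.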