Terminal Guidance Law Design Based on DDPG Algorithm

Cited: 0
Authors
Liu Y. [1 ]
He Z.-Z. [1 ]
Wang C.-Y. [1 ]
Guo M.-Z. [2 ]
Affiliations
[1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin
[2] School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing
Keywords
Deterministic policy; Inductive bias; Reinforcement learning; Terminal guidance law
DOI
10.11897/SP.J.1016.2021.01854
Abstract
Terminal guidance law design is a key technology in interception systems. The performance of the widely used proportional navigation guidance law and its variants degrades against highly maneuvering targets and is sensitive to the choice of navigation ratio. This paper proposes a terminal guidance law design method based on the DDPG algorithm. By designing the environment state and action (control quantity) for the interception problem, a guidance law is learned that maximizes reward on interaction data collected from a simulation environment. Compared with traditional methods, this model-free approach is more flexible. To address the low training efficiency caused by the weak inductive bias of the action set in the reinforcement learning method, a second method is proposed that takes the navigation ratio as the decision variable to be optimized; this accelerates training and dynamically adjusts the navigation ratio of the proportional navigation guidance law. Comparative experiments show that both reinforcement-learning-based terminal guidance law designs achieve better interception performance than the proportional navigation guidance law and its variants, demonstrating good research prospects and potential application value. © 2021, Science Press. All rights reserved.
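To make the two designs concrete, the Python sketch below shows classic proportional navigation (PN) in a 2-D engagement and, alongside it, the paper's second variant, in which the navigation ratio N is supplied per step by a learned policy instead of being fixed. This is a minimal illustration under stated assumptions: the state layout, the numeric values, and the policy stub are hypothetical and are not the authors' implementation.

import numpy as np

def pn_acceleration(r_rel, v_rel, N):
    # Pure proportional navigation: a = N * Vc * lambda_dot (2-D case).
    r2 = float(r_rel @ r_rel)
    los_rate = (r_rel[0] * v_rel[1] - r_rel[1] * v_rel[0]) / r2  # lambda_dot
    closing_speed = -float(r_rel @ v_rel) / np.sqrt(r2)          # Vc = -r_dot
    return N * closing_speed * los_rate

def policy(state):
    # Hypothetical stand-in for a trained DDPG actor that maps the
    # engagement state to a navigation ratio (assumption: the actor
    # output is bounded to a plausible range such as [2, 6]).
    return 4.0

# One guidance step: relative position/velocity of the target w.r.t. the
# interceptor (illustrative numbers, not taken from the paper).
r_rel = np.array([9000.0, 3000.0])   # m
v_rel = np.array([-800.0, -60.0])    # m/s
state = np.concatenate([r_rel, v_rel])

a_pn = pn_acceleration(r_rel, v_rel, N=3.0)            # classic PN, fixed N
a_rl = pn_acceleration(r_rel, v_rel, N=policy(state))  # DDPG-tuned N
print(f"PN command: {a_pn:.1f} m/s^2, RL-tuned command: {a_rl:.1f} m/s^2")

In the paper's first variant the actor would instead output the control quantity (lateral acceleration) directly; restricting the action to the navigation ratio injects the inductive bias of proportional navigation, which is what accelerates training.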
Pages: 1854-1865
Number of pages: 11