Study on UAV obstacle avoidance algorithm based on deep recurrent double Q network

Cited: 0
Authors
Wei Y. [1 ]
Liu Z. [2 ]
Cai B. [3 ,4 ]
Chen J. [3 ,4 ]
Yang Y. [5 ]
Zhang K. [5 ]
Institutions
[1] School of Astronautics, Northwestern Polytechnical University, Xi'an
[2] The Third Military Representative Office of Beijing Military Representative, Office of Air Force Equipment Department in Tianjin, Tianjin
[3] Shanghai Aerospace Control Technology Institute, Shanghai
[4] Infrared Detection Technology R & D Center of China Aerospace Science and Technology Corporation, Shanghai
[5] Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an
Keywords
DDQN; deep reinforcement learning; obstacle avoidance; recurrent neural network; UAV;
DOI
10.1051/jnwpu/20224050970
Abstract
Traditional reinforcement learning methods suffer from overestimation of the value function and from partial observability in machine motion planning, and in particular in the UAV obstacle avoidance problem, which leads to long training times and difficult convergence during network training. This paper proposes a UAV obstacle avoidance algorithm based on a deep recurrent double Q network. By transforming the single-network structure into a dual-network structure, optimal action selection is decoupled from action value estimation, reducing overestimation of the value function. A GRU recurrent neural network module is introduced into the fully connected layer; the GRU processes information along the time dimension, improving the interpretability of the network and the performance of the algorithm in partially observable environments. On this basis, a prioritized experience replay mechanism is incorporated to accelerate network convergence. Finally, the original and improved algorithms are tested in a simulation environment. The experimental results show that the improved algorithm performs better in terms of training time, obstacle avoidance success rate, and robustness. ©2022 Journal of Northwestern Polytechnical University.
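The core idea the abstract describes, decoupling action selection (online network) from action value estimation (target network), can be sketched as follows. This is a minimal NumPy illustration of the double-Q target, not the authors' implementation; the network outputs are stood in by hypothetical Q-value arrays, and all names here are illustrative.

```python
import numpy as np

def double_q_target(q_online_next, q_target_next, reward, gamma, done):
    """Double-Q bootstrap target: the online network selects the greedy
    action, while the target network evaluates it. This decoupling is what
    reduces the overestimation bias of a single-network Q-learning update."""
    best_action = int(np.argmax(q_online_next))   # selection (online net)
    bootstrap = q_target_next[best_action]        # evaluation (target net)
    return reward + gamma * bootstrap * (1.0 - float(done))

# Hypothetical Q(s', .) values: the online net overestimates action 1,
# but the target net's estimate of that same action tempers the target.
q_online = np.array([1.0, 2.5, 0.3])
q_target = np.array([1.1, 1.8, 0.4])
y = double_q_target(q_online, q_target, reward=1.0, gamma=0.9, done=False)
print(round(y, 2))  # 1.0 + 0.9 * 1.8 = 2.62
```

A single-network target would instead bootstrap from `max(q_target)` chosen by the same network that evaluates it, which systematically inflates the target when Q-value noise is present.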
Pages: 970-979
Number of pages: 9