Autonomous navigation based on PPO for mobile platform

Cited: 0
Authors
Xu G. [1 ]
Xiong Y. [1 ]
Zhou B. [1 ]
Chen G. [1 ]
Affiliations
[1] Key Laboratory of Autonomous Transportation Technology for Special Vehicles, Ministry of Industry and Information Technology, School of Transportation Science and Engineering, Beihang University, Beijing
Funding
National Natural Science Foundation of China;
Keywords
artificial potential field; autonomous navigation; mobile platform; proximal policy optimization algorithm; reinforcement learning;
DOI
10.13700/j.bh.1001-5965.2021.0100
Abstract
This paper presents an autonomous navigation method for mobile platforms based on the proximal policy optimization (PPO) algorithm. In this method, GNSS and LiDAR are used to sense environmental information. To define the state of the reinforcement learning model, an ego-position evaluation method based on an improved artificial potential field algorithm is introduced. On top of the PPO algorithm, an action policy function based on the Gaussian distribution is designed, which addresses the continuity of the vehicle's linear velocity and yaw velocity. Furthermore, the network architecture and reward function of the model are also designed for navigation scenarios. To train the navigation model, a virtual environment is built in Gazebo. The training results show that the ego-position evaluation method clearly improves the speed of model convergence. Finally, the navigation model is transferred to a real environment, which verifies the effectiveness of the proposed method. © 2022 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
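The Gaussian action policy described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the class and function names, the two-dimensional action (assumed to be linear velocity and yaw velocity), and the fixed log-std parameterization are all illustrative assumptions. The policy samples continuous actions from a normal distribution, and the clipped surrogate objective shows how PPO uses the resulting log-probabilities.

```python
import numpy as np

class GaussianPolicy:
    """Illustrative Gaussian policy head for continuous control.

    Hypothetical setup: `mean` would come from a policy network given
    the state; `log_std` is a learnable per-dimension parameter.
    """

    def __init__(self, mean, log_std):
        self.mean = np.asarray(mean, dtype=float)
        self.log_std = np.asarray(log_std, dtype=float)

    def sample(self, rng):
        # Action ~ N(mean, std^2), drawn independently per dimension,
        # e.g. [linear velocity, yaw velocity] for a mobile platform.
        std = np.exp(self.log_std)
        return self.mean + std * rng.standard_normal(self.mean.shape)

    def log_prob(self, action):
        # Sum of independent 1-D Gaussian log-densities.
        std = np.exp(self.log_std)
        return float(np.sum(
            -0.5 * ((action - self.mean) / std) ** 2
            - self.log_std
            - 0.5 * np.log(2.0 * np.pi)
        ))

def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """PPO clipped surrogate objective for a single sample."""
    ratio = np.exp(logp_new - logp_old)
    return min(ratio * advantage,
               float(np.clip(ratio, 1.0 - eps, 1.0 + eps)) * advantage)
```

When the new and old policies agree, the probability ratio is 1 and the clipped objective reduces to the advantage itself; clipping only binds once the ratio drifts outside [1 - eps, 1 + eps].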
Pages: 2138-2145
Page count: 7
References
15 entries in total
  • [1] WANG Y L., Research and implementation of autonomous navigation and obstacle avoidance system for ground unmanned platform, (2020)
  • [2] QIN S R., Research on laser sensor-based navigation system for mobile robots, (2020)
  • [3] HART P E, NILSSON N J, RAPHAEL B., A formal basis for the heuristic determination of minimum cost paths in graphs, IEEE Transactions on Systems Science and Cybernetics, 4, 2, pp. 100-107, (1968)
  • [4] STENTZ A., Optimal and efficient path planning for partially-known environments, Proceedings of IEEE International Conference on Robotics and Automation, 4, pp. 3310-3317, (1994)
  • [5] STENTZ A., The Focussed D* algorithm for real-time replanning, Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 1652-1659, (1995)
  • [6] LAVALLE S M, KUFFNER J J., Randomized kinodynamic planning, Proceedings of IEEE International Conference on Robotics and Automation, 1, pp. 473-479, (1999)
  • [7] FU X J., Research on autonomous navigation of mobile robots based on reinforcement learning, (2017)
  • [8] YANG N B., Research on autonomous navigation of omnidirectional mode mobile robots for environmental detection, (2019)
  • [9] TAO R., Deep reinforcement learning-based navigation for mobile robots, (2020)
  • [10] HE C., Robot visual navigation algorithm based on deep reinforcement learning, (2021)