An Efficiently Convergent Deep Reinforcement Learning-Based Trajectory Planning Method for Manipulators in Dynamic Environments

Authors
Li Zheng
YaHao Wang
Run Yang
Shaolei Wu
Rui Guo
Erbao Dong
Affiliations
[1] University of Science and Technology of China, CAS Key Laboratory of Mechanical Behavior and Design of Materials, Department of Precision Machinery and Precision Instrumentation
[2] State Grid Anhui Electric Power Company Electric Power Research Institute
[3] State Grid Intelligent Technology Co
Source
Journal of Intelligent & Robotic Systems, 2023, 107 (04)
Keywords
Manipulator trajectory planning; Deep reinforcement learning; Autonomous navigation; Real-time obstacle avoidance; Dynamic action selection strategy; Combinatorial reward function
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Deep reinforcement learning (DRL) has recently been applied to manipulator trajectory planning because of its potential for solving planning problems in multidimensional spaces. However, many DRL models proposed for manipulators working in dynamic environments struggle to obtain the optimal strategy: massive ineffective exploration and sparse rewards prevent them from converging. In this paper, we address the inefficient convergence problem at two levels, the action selection strategy and the reward function. First, we design a dynamic action selection strategy that uses a variable guide item to provide positive samples with high probability during the pre-training period, which effectively reduces invalid exploration. Second, we propose a combinatorial reward function that combines the artificial potential field method with a time-energy function, greatly improving the efficiency and stability of DRL-based manipulator trajectory planning in dynamic working environments. Extensive experiments are conducted in CoppeliaSim with a simulation model comprising a 6-DOF manipulator and a freely moving obstacle. The results show that the proposed dynamic action selection strategy and combinatorial reward function improve the convergence rate of the DDPG, TD3, and SAC algorithms by up to 3 to 5 times. Furthermore, the mean value of the reward function increases by up to 1.47 to 2.70 times, and the standard deviation decreases by 27.56% to 56.60%.
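The abstract names two ideas, a combinatorial reward (artificial potential field plus a time-energy term) and a guided action selection whose influence varies during pre-training, but it does not give their formulas. The following is only a minimal sketch of what such components might look like, not the authors' method. Every function name, gain, the safety distance `d_safe`, the linear decay schedule, and the blending rule are assumptions introduced for illustration.

```python
import math


def apf_reward(ee_pos, goal_pos, obstacle_pos, k_att=1.0, k_rep=0.5, d_safe=0.2):
    """Potential-field-style shaping term (assumed form, not from the paper).

    Attractive part: negative distance from the end effector to the goal.
    Repulsive part: active only when the obstacle is closer than d_safe.
    """
    d_goal = math.dist(ee_pos, goal_pos)
    d_obs = math.dist(ee_pos, obstacle_pos)
    attractive = -k_att * d_goal
    repulsive = 0.0
    if d_obs < d_safe:
        # Classic APF-like repulsion; the clamp avoids division by zero.
        repulsive = -k_rep * (1.0 / max(d_obs, 1e-6) - 1.0 / d_safe) ** 2
    return attractive + repulsive


def time_energy_penalty(joint_velocities, dt, w_time=0.01, w_energy=0.001):
    """Penalize elapsed time and a crude energy proxy (sum of squared joint speeds)."""
    energy = sum(v * v for v in joint_velocities) * dt
    return -(w_time * dt + w_energy * energy)


def combined_reward(ee_pos, goal_pos, obstacle_pos, joint_velocities, dt):
    """Sum of the two assumed terms; the paper's actual weighting is not given."""
    return apf_reward(ee_pos, goal_pos, obstacle_pos) + time_energy_penalty(
        joint_velocities, dt
    )


def guide_weight(step, pretrain_steps):
    """Assumed schedule: guidance decays linearly from 1 to 0 over pre-training."""
    return max(0.0, 1.0 - step / pretrain_steps)


def select_action(policy_action, guide_action, step, pretrain_steps):
    """Blend the policy action with a hypothetical guide action.

    Early in pre-training the guide dominates, so more samples are useful;
    afterwards the policy action is returned unchanged.
    """
    w = guide_weight(step, pretrain_steps)
    return [(1.0 - w) * p + w * g for p, g in zip(policy_action, guide_action)]
```

In this sketch the guide action could come from any simple heuristic, such as a step pointing toward the goal. The underlying DDPG, TD3, or SAC update is left untouched, since only the sampled action and the reward signal are modified.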
Related papers
  • [1] An Efficiently Convergent Deep Reinforcement Learning-Based Trajectory Planning Method for Manipulators in Dynamic Environments
    Zheng, Li
    Wang, YaHao
    Yang, Run
    Wu, Shaolei
    Guo, Rui
    Dong, Erbao
    [J]. JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2023, 107 (04)
  • [2] Deep reinforcement learning-based reactive trajectory planning method for UAVs
    Cao, Lijia
    Wang, Lin
    Liu, Yang
    Xu, Weihong
    Geng, Chuang
    [J]. PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART G-JOURNAL OF AEROSPACE ENGINEERING, 2024, 238 (10) : 1018 - 1037
  • [3] Reinforcement Learning-based Adaptive Trajectory Planning for AUVs in Under-ice Environments
    Wang, Chaofeng
    Wei, Li
    Wang, Zhaohui
    Song, Min
    Mahmoudian, Nina
    [J]. OCEANS 2018 MTS/IEEE CHARLESTON, 2018
  • [4] Deep reinforcement learning-based rehabilitation robot trajectory planning with optimized reward functions
    Wang, Xusheng
    Xie, Jiexin
    Guo, Shijie
    Li, Yue
    Sun, Pengfei
    Gan, Zhongxue
    [J]. ADVANCES IN MECHANICAL ENGINEERING, 2021, 13 (12)
  • [5] Deep Reinforcement Learning-Based 3D Trajectory Planning for Cellular Connected UAV
    Liu, Xiang
    Zhong, Weizhi
    Wang, Xin
    Duan, Hongtao
    Fan, Zhenxiong
    Jin, Haowen
    Huang, Yang
    Lin, Zhipeng
    [J]. DRONES, 2024, 8 (05)
  • [6] Deep Reinforcement Learning-Based Path Planning for Multi-Arm Manipulators with Periodically Moving Obstacles
    Prianto, Evan
    Park, Jae-Han
    Bae, Ji-Hun
    Kim, Jung-Su
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (06)
  • [7] Online Trajectory Planning Method for Midcourse Guidance Phase Based on Deep Reinforcement Learning
    Li, Wanli
    Li, Jiong
    Li, Ningbo
    Shao, Lei
    Li, Mingjie
    [J]. AEROSPACE, 2023, 10 (05)
  • [8] A Learning-Based Model Predictive Trajectory Planning Controller for Automated Driving in Unstructured Dynamic Environments
    Li, Zhiyuan
    Zhao, Pan
    Jiang, Chunmao
    Huang, Weixin
    Liang, Huawei
    [J]. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (06) : 5944 - 5959
  • [9] A Deep Reinforcement Learning-Based Dynamic Computational Offloading Method for Cloud Robotics
    Penmetcha, Manoj
    Min, Byung-Cheol
    [J]. IEEE ACCESS, 2021, 9 : 60265 - 60279