A proximal policy optimization based deep reinforcement learning framework for tracking control of a flexible robotic manipulator

Cited by: 0
Authors
Kumar, V. Joshi [1]
Elumalai, Vinodh Kumar [1]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamilnadu, India
Keywords
Deep reinforcement learning; Proximal policy gradient; Policy feedback; Flexible joint manipulator; Vibration suppression
DOI
10.1016/j.rineng.2025.104178
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper puts forward a policy-feedback-based deep reinforcement learning (DRL) control scheme for a partially observable system by leveraging the proximal policy optimization (PPO) algorithm and a convolutional neural network (CNN). Although several DRL algorithms have been investigated for fully observable systems, there have been limited studies on devising DRL control for a partially observable system with uncertain dynamics. Moreover, the major limitation of existing policy-gradient-based DRL techniques is that they are computationally expensive and suffer from scalability issues for complex higher-order systems. Hence, in this study, we adopt the PPO technique, which utilizes first-order optimization to minimize the computational complexity, and devise a DRL scheme for a partially observable flexible link robot manipulator system. Specifically, to improve the stability and convergence of the PPO algorithm, this study adopts a collaborative policy approach in the update of the value function and presents a collaborative proximal policy optimization (CPPO) algorithm that can address the tracking control and vibration suppression problems in a partially observable robotic manipulator system. After identifying the optimal hyper-parameters of the DRL scheme using the grid search method, we exploit the capability of the CNN in an actor-critic architecture to extract the spatial dependencies in the state sequences of the dynamical system and boost the DRL performance. To improve the convergence of the proposed DRL algorithm, this study adopts a Lyapunov-based reward shaping technique. Experimental validation on the robotic manipulator system through hardware-in-the-loop (HIL) testing substantiates that the proposed framework offers faster convergence and better vibration suppression than the state-of-the-art policy gradient and actor-critic techniques.
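The abstract describes PPO as relying on first-order optimization. For orientation, the following is a minimal sketch of the standard PPO clipped surrogate objective, not the paper's CPPO variant, whose collaborative value-function update is not specified in this record. The function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the clip_eps default are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

Because the objective needs only the gradient of this expression, it can be optimized with ordinary stochastic gradient methods; the clipping removes the incentive to move the policy far from the one that collected the data.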
Pages: 15