A proximal policy optimization based deep reinforcement learning framework for tracking control of a flexible robotic manipulator

Authors
Kumar, V. Joshi [1]
Elumalai, Vinodh Kumar [1]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamilnadu, India
Keywords
Deep reinforcement learning; Proximal policy gradient; Policy feedback; Flexible joint manipulator; Vibration suppression;
DOI
10.1016/j.rineng.2025.104178
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
This paper puts forward a policy feedback based deep reinforcement learning (DRL) control scheme for a partially observable system that combines the proximal policy optimization (PPO) algorithm with a convolutional neural network (CNN). Although several DRL algorithms have been investigated for fully observable systems, there have been few studies on devising DRL control for a partially observable system with uncertain dynamics. Moreover, the major limitation of existing policy gradient based DRL techniques is that they are computationally expensive and scale poorly to complex higher-order systems. Hence, in this study, we adopt the PPO technique, which uses first-order optimization to reduce the computational complexity, and devise a DRL scheme for a partially observable flexible link robot manipulator system. Specifically, to improve the stability and convergence of the PPO algorithm, this study adopts a collaborative policy approach in the update of the value function and presents a collaborative proximal policy optimization (CPPO) algorithm that addresses the tracking control and vibration suppression problems in a partially observable robotic manipulator system. After identifying the optimal hyper-parameters of the DRL scheme using the grid search method, we exploit the capability of the CNN in an actor-critic architecture to extract the spatial dependencies in the state sequences of the dynamical system and boost the DRL performance. To improve the convergence of the proposed DRL algorithm, this study adopts a Lyapunov based reward shaping technique. Experimental validation on the robotic manipulator system through hardware-in-the-loop (HIL) testing substantiates that the proposed framework offers faster convergence and better vibration suppression than the state-of-the-art policy gradient and actor-critic techniques.
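The abstract names two ingredients that a short sketch may help make concrete: the first-order clipped objective of standard PPO and a Lyapunov based shaping term added to the reward. The Python below is only an illustration under stated assumptions and is not the authors' CPPO implementation. The function names, the quadratic Lyapunov candidate, the gain `k`, and the potential-based form of the shaping are all hypothetical choices made here; the record above does not give the paper's actual formulas.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is the clip range.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def lyapunov_candidate(error, error_rate, k=1.0):
    """Assumed quadratic candidate V = 0.5 * (k * e^2 + e_dot^2)."""
    return 0.5 * (k * error ** 2 + error_rate ** 2)


def shaped_reward(reward, state, next_state, gamma=0.99):
    """Potential-based shaping with potential Phi = -V (an assumption).

    state / next_state: (tracking_error, tracking_error_rate) tuples.
    A decrease of V along the transition yields a positive bonus.
    """
    phi = -lyapunov_candidate(*state)
    phi_next = -lyapunov_candidate(*next_state)
    return reward + gamma * phi_next - phi
```

In this sketch the clipping removes any incentive to move the probability ratio outside `[1 - clip_eps, 1 + clip_eps]`, which is what lets PPO rely on plain first-order gradient steps. The shaping term rewards transitions along which the assumed Lyapunov candidate decreases, for example when the tracking error shrinks.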
Pages: 15
Related papers
50 in total
  • [21] A Proximal Policy Optimization Based Control Framework for Flexible Battery Energy Storage System
    Meng, Jinhao
    Yang, Feng
    Peng, Jichang
    Gao, Fei
    IEEE TRANSACTIONS ON ENERGY CONVERSION, 2024, 39 (02) : 1183 - 1191
  • [22] Cascade control of underactuated manipulator based on reinforcement learning framework
    Jiang, Naijing
    Guo, Dingxu
    Zhang, Shu
    Zhang, Dan
    Xu, Jian
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART I-JOURNAL OF SYSTEMS AND CONTROL ENGINEERING, 2023, 237 (02) : 231 - 243
  • [23] Reinforcement learning-based adaptive tracking control for flexible-joint robotic manipulators
    Zhong, Huihui
    Wen, Weijian
    Fan, Jianjun
    Yang, Weijun
    AIMS MATHEMATICS, 2024, 9 (10) : 27330 - 27360
  • [24] Adaptive sliding mode control of robotic manipulator based on reinforcement learning
    Ren, Ziwu
    Chen, Jie
    Miao, Yunxi
    Miao, Yujie
    Guo, Zibo
    Hu, Biao
    Lin, Rui
    ASIAN JOURNAL OF CONTROL, 2024, 26 (05) : 2703 - 2718
  • [25] Vision-based Deep Reinforcement Learning to Control a Manipulator
    Kim, Wonchul
    Kim, Taewan
    Lee, Jonggu
    Kim, H. Jin
    2017 11TH ASIAN CONTROL CONFERENCE (ASCC), 2017, : 1046 - 1050
  • [26] Impedance Control of Space Manipulator Based on Deep Reinforcement Learning
    Sun, Yu
    Cao, Heyang
    Ma, Rui
    Wang, Guan
    Ma, Guangcheng
    Xia, Hongwei
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 3609 - 3614
  • [27] Deep Reinforcement Learning Based on Proximal Policy Optimization for the Maintenance of a Wind Farm with Multiple Crews
    Pinciroli, Luca
    Baraldi, Piero
    Ballabio, Guido
    Compare, Michele
    Zio, Enrico
    ENERGIES, 2021, 14 (20)
  • [28] A Deep Reinforcement Learning Framework for Control of Robotic Manipulators in Simulated Environments
    Calderon-Cordova, Carlos
    Sarango, Roger
    Castillo, Darwin
    Lakshminarayanan, Vasudevan
    IEEE ACCESS, 2024, 12 : 103133 - 103161
  • [29] An Antenna Optimization Framework Based on Deep Reinforcement Learning
    Peng, Fengling
    Chen, Xing
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2024, 72 (10) : 7594 - 7605
  • [30] Research on Manipulator Control Based on Improved Proximal Policy Optimization Algorithm
    Yang, Shaoxiong
    Wu, Di
    Pan, Yan
    He, Yan
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 4301 - 4306