Stability Control of a Biped Robot on a Dynamic Platform Based on Hybrid Reinforcement Learning

Cited by: 5
Authors
Xi, Ao [1]
Chen, Chao [1]
Affiliations
[1] Monash Univ, Lab Mot Generat & Anal, Fac Engn, Clayton, Vic 3800, Australia
Keywords
biped robot; reinforcement learning; stability control; Gaussian processes; DQN (lambda)
DOI
10.3390/s20164468
CLC Classification
O65 [Analytical Chemistry]
Discipline Codes
070302; 081704
Abstract
In this work, we introduced a novel hybrid reinforcement learning scheme to balance a biped robot (NAO) on an oscillating platform, where the rotation of the platform is treated as an external disturbance to the robot. The platform had two rotational degrees of freedom, pitch and roll. The state space comprised the position of the center of pressure and the joint angles and joint velocities of both legs. The action space consisted of the joint angles of the ankles, knees, and hips. By incorporating inverse kinematics, the dimension of the action space was significantly reduced. A model-based system estimator was then employed during the offline training procedure to estimate the dynamics model of the system using novel hierarchical Gaussian processes and to provide initial control inputs, after which the reduced action space of each joint was obtained by minimizing the cost of reaching the desired stable state. Finally, a model-free optimizer based on DQN (lambda) was introduced to fine-tune the initial control inputs, yielding the optimal control input for each joint at any state. The proposed reinforcement learning scheme not only avoided the distribution-mismatch problem but also improved sample efficiency. Simulation results showed that the proposed hybrid reinforcement learning mechanism enabled the NAO robot to balance on an oscillating platform across different frequencies and magnitudes. Both control performance and robustness were maintained throughout the experiments.
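The two-stage scheme described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: a plain GP regressor stands in for the hierarchical Gaussian-process dynamics estimator, and a tabular Q(lambda) update with eligibility traces stands in for the DQN (lambda) fine-tuner; all function names and parameter values here are illustrative assumptions.

```python
import numpy as np

# Stage 1 (model-based): a toy GP regressor standing in for the paper's
# hierarchical Gaussian-process dynamics estimator, s' ~ f(s, a).
def rbf_kernel(X, Y, length=1.0):
    # Squared-exponential kernel between two sets of row vectors.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gp_predict(X_train, y_train, X_test, noise=1e-3):
    # Posterior mean of a zero-mean GP with an RBF kernel.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    return K_s @ np.linalg.solve(K, y_train)

# Stage 2 (model-free): tabular Q(lambda) update with accumulating
# eligibility traces, a stand-in for the DQN (lambda) fine-tuner.
def q_lambda_update(Q, E, s, a, r, s_next, alpha=0.1, gamma=0.95, lam=0.8):
    td = r + gamma * Q[s_next].max() - Q[s, a]  # TD error
    E[s, a] += 1.0          # mark the visited state-action pair
    Q += alpha * td * E     # credit all recently visited pairs
    E *= gamma * lam        # decay traces toward zero
    return Q, E

# In the hybrid scheme, the model-based stage would pick an initial action by
# minimizing predicted cost under gp_predict over candidate actions; the
# model-free stage then refines those inputs online via q_lambda_update.
```

The separation mirrors the abstract: the GP supplies sample-efficient initial control inputs offline, and the trace-based model-free update fine-tunes them, which is how the scheme avoids relying solely on the learned model (the source of distribution mismatch).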
Pages: 1-21
Page count: 21
Related Papers
50 records total
  • [1] Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning
    Xi, Ao
    Mudiyanselage, Thushal Wijekoon
    Tao, Dacheng
    Chen, Chao
    [J]. IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2019, 6 (04): 938-951
  • [2] Hybrid reinforcement learning and its application to biped robot control
    Yamada, S
    Watanabe, A
    Nakashima, M
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10: 1071-1077
  • [3] Walking Control of a Biped Robot on Static and Rotating Platforms Based on Hybrid Reinforcement Learning
    Xi, Ao
    Chen, Chao
    [J]. IEEE ACCESS, 2020, 8: 148411-148424
  • [4] Posture self-stabilizer of a biped robot based on training platform and reinforcement learning
    Wu, Weiguo
    Gao, Liyang
    [J]. ROBOTICS AND AUTONOMOUS SYSTEMS, 2017, 98: 42-55
  • [5] A Disturbance Rejection Control Method Based on Deep Reinforcement Learning for a Biped Robot
    Liu, Chuzhao
    Gao, Junyao
    Tian, Dingkui
    Zhang, Xuefeng
    Liu, Huaxin
    Meng, Libo
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (04): 1-17
  • [6] Gait Balance of Biped Robot based on Reinforcement Learning
    Hwang, Kao-Shing
    Li, Jhe-Syun
    Jiang, Wei-Cheng
    Wang, Wei-Han
    [J]. 2013 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2013: 435-439
  • [7] A dynamic hybrid control of biped robot in supporting area
    Morisawa, M
    Ohnishi, K
    [J]. 8TH IEEE INTERNATIONAL WORKSHOP ON ADVANCED MOTION CONTROL, PROCEEDINGS, 2004: 381-386
  • [8] Deep reinforcement learning method for biped robot gait control
    Feng, C.
    Zhang, Y.
    Huang, C.
    Jiang, W.
    Wu, Z.
    [J]. CIMS, (27): 2341-2349
  • [9] Reinforcement learning control for biped robot walking on uneven surfaces
    Wang, Shouyi
    Braaksma, Jelmer
    Babuska, Robert
    Hobbelen, Daan
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006: 4173+