Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning

Cited by: 28
Authors
Xi, Ao [1]
Mudiyanselage, Thushal Wijekoon [2]
Tao, Dacheng [3, 4]
Chen, Chao [1 ]
Affiliations
[1] Monash Univ, Dept Mech & Aerosp Engn, Clayton, Vic 3800, Australia
[2] Monash Univ, Clayton, Vic 3800, Australia
[3] Univ Sydney, UBTECH Sydney Artificial Intelligence Ctr, Fac Engn, Sydney, NSW 2008, Australia
[4] Univ Sydney, Sch Comp Sci, Fac Engn, Sydney, NSW 2008, Australia
Keywords
Biped robot; Gaussian processes (GP); reinforcement learning; temporal difference; humanoid robot
DOI
10.1109/JAS.2019.1911567
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this work, we combine model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and is treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods, such as probabilistic inference for learning control (PILCO), do not require an explicit global model because the actions are obtained by directly searching the policy space, overfitting and a lack of model complexity may still result in a large deviation between the prediction and the real system. Moreover, none of these approaches considers data error during training or measurement noise during testing. We propose a hierarchical Gaussian process (GP) model, containing two layers of independent GPs, from which a physically continuous probability transition model of the robot is obtained. Because the estimation is physically continuous, the algorithm overcomes the overfitting problem with a guaranteed model complexity, and the amount of training data required is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(lambda)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform and that it can adapt to a platform with varying angular velocity.
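For readers unfamiliar with Q(lambda), the following is a minimal sketch of the textbook tabular variant (Watkins's Q(lambda) with accumulating eligibility traces). It is not the novel scheme described in the abstract, whose details are not given here. The 1-D chain environment, the function name `watkins_q_lambda`, and all hyperparameter values are hypothetical choices made only for illustration.

```python
import random


def watkins_q_lambda(n_states=5, episodes=200, alpha=0.1, gamma=0.9,
                     lam=0.8, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Watkins's Q(lambda) on a hypothetical 1-D chain.

    States are 0..n_states-1. Action 0 moves left (clamped at state 0),
    action 1 moves right. Reaching the rightmost state ends the episode
    with reward 1; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    def choose(s):
        # Epsilon-greedy action selection; ties are broken toward "right".
        if rng.random() < epsilon:
            return rng.randrange(2)
        return 0 if q[s][0] > q[s][1] else 1

    for _ in range(episodes):
        traces = [[0.0, 0.0] for _ in range(n_states)]
        s = 0
        a = choose(s)
        for _ in range(max_steps):
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0

            if done:
                a_next = None
                delta = r - q[s][a]
                decay = 0.0
            else:
                a_next = choose(s_next)
                best = max(q[s_next])
                # Off-policy target: bootstrap from the greedy value.
                delta = r + gamma * best - q[s][a]
                # Keep traces only while the behaviour stays greedy.
                decay = gamma * lam if q[s_next][a_next] == best else 0.0

            traces[s][a] += 1.0  # accumulating trace
            for i in range(n_states):
                for j in range(2):
                    q[i][j] += alpha * delta * traces[i][j]
                    traces[i][j] *= decay

            if done:
                break
            s, a = s_next, a_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(watkins_q_lambda()):
        print(state, [round(v, 3) for v in values])
```

The trace decay of `gamma * lam` lets one temporal-difference error update several recently visited state-action pairs at once, and zeroing the traces after an exploratory action keeps the update off-policy.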
Pages: 938-951
Number of pages: 14
Related papers
50 in total
  • [1] Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning
    Ao Xi
    Thushal Wijekoon Mudiyanselage
    Dacheng Tao
    Chao Chen
    [J]. IEEE/CAA Journal of Automatica Sinica, 2019, 6 (04) : 938 - 951
  • [2] Gait Balance of Biped Robot based on Reinforcement Learning
    Hwang, Kao-Shing
    Li, Jhe-Syun
    Jiang, Wei-Cheng
    Wang, Wei-Han
    [J]. 2013 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2013 : 435 - 439
  • [3] Stability Control of a Biped Robot on a Dynamic Platform Based on Hybrid Reinforcement Learning
    Xi, Ao
    Chen, Chao
    [J]. SENSORS, 2020, 20 (16) : 1 - 21
  • [4] Walking Control of a Biped Robot on Static and Rotating Platforms Based on Hybrid Reinforcement Learning
    Xi, Ao
    Chen, Chao
    [J]. IEEE ACCESS, 2020, 8 : 148411 - 148424
  • [5] Biped Balance Control by Reinforcement Learning
    Hwang, Kao-Shing
    Lin, Jin-Ling
    Li, Jhe-Syun
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2016, 32 (04) : 1041 - 1060
  • [6] Posture self-stabilizer of a biped robot based on training platform and reinforcement learning
    Wu, Weiguo
    Gao, Liyang
    [J]. ROBOTICS AND AUTONOMOUS SYSTEMS, 2017, 98 : 42 - 55
  • [7] Dynamic balance of a biped robot using fuzzy reinforcement learning agents
    Zhou, CJ
    Meng, QC
    [J]. FUZZY SETS AND SYSTEMS, 2003, 134 (01) : 169 - 187
  • [8] A Disturbance Rejection Control Method Based on Deep Reinforcement Learning for a Biped Robot
    Liu, Chuzhao
    Gao, Junyao
    Tian, Dingkui
    Zhang, Xuefeng
    Liu, Huaxin
    Meng, Libo
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (04) : 1 - 17
  • [9] Learning an Efficient Gait Cycle of a Biped Robot Based on Reinforcement Learning and Artificial Neural Networks
    Gil, Cristyan R.
    Calvo, Hiram
    Sossa, Humberto
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (03):
  • [10] Deep reinforcement learning method for biped robot gait control
    Feng, Chun
    Zhang, Yiwei
    Huang, Cheng
    Jiang, Wenbiao
    Wu, Zhiwei
    [J]. Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2021, 27 (08) : 2341 - 2349