Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning

Cited by: 28
Authors
Xi, Ao [1]
Mudiyanselage, Thushal Wijekoon [2]
Tao, Dacheng [3, 4]
Chen, Chao [1]
Affiliations
[1] Monash Univ, Dept Mech & Aerosp Engn, Clayton, Vic 3800, Australia
[2] Monash Univ, Clayton, Vic 3800, Australia
[3] Univ Sydney, UBTECH Sydney Artificial Intelligence Ctr, Fac Engn, Sydney, NSW 2008, Australia
[4] Univ Sydney, Sch Comp Sci, Fac Engn, Sydney, NSW 2008, Australia
Keywords
Biped robot; Gaussian processes (GP); reinforcement learning; temporal difference; humanoid robot
DOI
10.1109/JAS.2019.1911567
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this work, we combine model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and is treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods, such as probabilistic inference for learning control (PILCO), do not require an explicit global model because the actions are obtained by directly searching the policy space, overfitting and a lack of model complexity may still result in a large deviation between the prediction and the real system. Besides, none of these approaches considers data error during training or measurement noise during testing. We propose a hierarchical Gaussian process (GP) model containing two layers of independent GPs, from which a physically continuous probability transition model of the robot is obtained. Owing to the physically continuous estimation, the algorithm overcomes the overfitting problem with guaranteed model complexity, and the amount of training data required is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(λ)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform and can adapt to a platform with varying angular velocity.
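For readers unfamiliar with the Q(λ) family mentioned in the abstract, the following is a minimal sketch of one step of the standard tabular Watkins's Q(λ) update with accumulating eligibility traces. It is not the novel scheme proposed in the paper, whose details are not given in this record. The function name `watkins_q_lambda_step`, the table shapes, and the values of `alpha`, `gamma`, and `lam` are illustrative assumptions.

```python
import numpy as np


def watkins_q_lambda_step(Q, E, s, a, r, s_next, a_next,
                          alpha=0.1, gamma=0.9, lam=0.8, terminal=False):
    """Apply one in-place Watkins's Q(lambda) update and return the TD error.

    Q and E are (n_states, n_actions) arrays holding action values and
    eligibility traces. All hyperparameter defaults are assumed values.
    """
    # Bootstrap from the greedy value of the next state (zero if terminal).
    best_next = 0.0 if terminal else float(np.max(Q[s_next]))
    # Check, before Q changes, whether the action actually taken next is greedy.
    next_is_greedy = (not terminal) and Q[s_next, a_next] == best_next

    # Temporal-difference error for the transition (s, a, r, s_next).
    delta = r + gamma * best_next - Q[s, a]

    # Accumulating trace: mark the visited state-action pair.
    E[s, a] += 1.0
    # Every recently visited pair shares the TD error in proportion to its trace.
    Q += alpha * delta * E

    if next_is_greedy:
        E *= gamma * lam   # decay traces while behaviour stays greedy
    else:
        E[:] = 0.0         # exploratory action or episode end: cut the traces
    return delta


# Hypothetical toy problem: 3 states, 2 actions, two rewarded transitions.
Q = np.zeros((3, 2))
E = np.zeros((3, 2))
watkins_q_lambda_step(Q, E, s=0, a=1, r=1.0, s_next=1, a_next=0)
watkins_q_lambda_step(Q, E, s=1, a=0, r=1.0, s_next=2, a_next=0)
```

After the second call, the value of the first pair `(0, 1)` has also increased, because its decayed trace lets it share the later TD error. This credit propagation along the recent trajectory is what distinguishes Q(λ) from one-step Q-learning.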
Pages: 938-951
Number of pages: 14