Balance Control of a Biped Robot on a Rotating Platform Based on Efficient Reinforcement Learning

Cited by: 28

Authors:

Xi, Ao ^{[1]}

Mudiyanselage, Thushal Wijekoon ^{[2]}

Tao, Dacheng ^{[3,4]}

Chen, Chao ^{[1]}

Affiliations:

[1] Monash Univ, Dept Mech & Aerosp Engn, Clayton, Vic 3800, Australia

[2] Monash Univ, Clayton, Vic 3800, Australia

[3] Univ Sydney, UBTECH Sydney Artificial Intelligence Ctr, Fac Engn, Sydney, NSW 2008, Australia

[4] Univ Sydney, Sch Comp Sci, Fac Engn, Sydney, NSW 2008, Australia

Source:

IEEE-CAA JOURNAL OF AUTOMATICA SINICA | 2019 / Vol. 6 / Issue 04

Keywords:

Biped robot; Gaussian processes (GP); reinforcement learning; temporal difference; HUMANOID ROBOT;

DOI:

10.1109/JAS.2019.1911567

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In this work, we combine model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and is treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods, such as probabilistic inference for learning control (PILCO), do not require an explicit global model because the actions are obtained by directly searching the policy space, overfitting and a lack of model complexity may still result in a large deviation between the prediction and the real system. Besides, none of these approaches considers the data error during training or the measurement noise during testing. We propose a hierarchical Gaussian process (GP) model containing two layers of independent GPs, from which a physically continuous probability transition model of the robot is obtained. Owing to the physically continuous estimation, the algorithm overcomes the overfitting problem with a guaranteed model complexity, and the number of training data points is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(λ)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform and is capable of adapting to a platform with varying angular velocity.

Citation

Pages: 938 - 951

Number of pages: 14

50 items in total

[41] Erratum to: Balance recovery control for biped robot based on reaction null space method
Dragomir N. Nenchev
[J]. Journal of Control Theory and Applications, 2010, 8 (4) : 549 - 549
[42] Dynamic Balance Control of Biped Robot Using Optimized SLFNs
Yang, Liang
Liu, Zhi
Zhang, Yun
[J]. PROCEEDINGS OF THE 28TH CHINESE CONTROL AND DECISION CONFERENCE (2016 CCDC), 2016, : 5303 - 5307
[43] Comment on "Balance recovery control for biped robot based on reaction null space method"
Baoping Wang
[J]. Control Theory and Technology, 2010, (04) : 548 - 548
[44] Reinforcement learning for robot control
Smart, WD
Kaelbling, LP
[J]. MOBILE ROBOTS XVI, 2002, 4573 : 92 - 103
[45] Model-Based Reinforcement Learning For Robot Control
Li, Xiang
Shang, Weiwei
Cong, Shuang
[J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2020), 2020, : 300 - 305
[46] Dynamic Balance of the Head in a Flexible Legged Robot for Efficient Biped Locomotion
Lisitano, Domenico
Bonisoli, Elvio
Recchiuto, Carmine Tommaso
Muscolo, Giovanni Gerardo
[J]. APPLIED SCIENCES-BASEL, 2021, 11 (07):
[47] Erratum to: Balance recovery control for biped robot based on reaction null space method
Journal of Control Theory and Applications
[J]. Control Theory and Technology, 2010, 8 (04) : 549 - 549
[48] Position-Based Lateral Balance Control for Knee-Stretched Biped Robot
Kajita, Shuuji
Benallegue, Mehdi
Cisneros, Rafael
Sakaguchi, Takeshi
Morisawa, Mitsuharu
Kaminaga, Hiroshi
Kumagai, Iori
Kaneko, Kenji
Kanehiro, Fumio
[J]. 2019 IEEE-RAS 19TH INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS), 2019, : 17 - 24
[49] Anti-push Method of Biped Robot Based on Motion Capture Point and Reinforcement Learning
Wang, Song
Piao, Songhao
Leng, Xiaokun
Chang, Lin
He, Zhicheng
[J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2020), 2020, : 408 - 413
[50] SofaGym: An Open Platform for Reinforcement Learning Based on Soft Robot Simulations
Schegg, Pierre
Menager, Etienne
Khairallah, Elie
Marchal, Damien
Dequidt, Jeremie
Preux, Philippe
Duriez, Christian
[J]. SOFT ROBOTICS, 2023, 10 (02) : 410 - 430
