Reinforcement Learning With Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion

Cited by: 0

Authors:

Shi, Haojie [1,2]

Zhou, Bo [2]

Zeng, Hongsheng [2]

Wang, Fan [2]

Dong, Yueqiang [2]

Li, Jiangyong [2]

Wang, Kang [2]

Tian, Hao [2]

Meng, Max Q-H [3,4]

Affiliations:

[1] Chinese Univ Hong Kong, Shenzhen 518057, Peoples R China

[2] Baidu, Beijing 100193, Peoples R China

[3] Southern Univ Sci & Technol, Shenzhen Key Lab Robot Percept & Intelligence, Dept Elect & Elect Engn, Shenzhen 518055, Peoples R China

[4] Chinese Univ Hong Kong, Dept Elect Engn, Shenzhen 518057, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2022 / Vol. 7 / Issue 02

Keywords:

Legged robots; machine learning for robot control; reinforcement learning; ROBOTS;

DOI:

10.1109/LRA.2022.3145495

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Recently, reinforcement learning (RL) has emerged as a promising approach for quadrupedal locomotion, as it can save the manual effort required by conventional approaches, such as designing skill-specific controllers. However, due to the complex nonlinear dynamics of quadrupedal robots and reward sparsity, it is still difficult for RL to learn effective gaits from scratch, especially in challenging tasks such as walking over a balance beam. To alleviate this difficulty, we propose a novel RL-based approach that contains an evolutionary foot trajectory generator. Unlike prior methods that use a fixed trajectory generator, the generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning. The policy is trained with reinforcement learning to output residual control signals that fit different gaits. We then optimize the trajectory generator and policy network alternately to stabilize the training and share the exploratory data to improve sample efficiency. As a result, our approach can solve a range of challenging tasks in simulation by learning from scratch, including walking on a balance beam and crawling through a cave. To further verify the effectiveness of our approach, we deploy the controller learned in simulation on a 12-DoF quadrupedal robot, and it can successfully traverse challenging scenarios with efficient gaits. We provide a video on YouTube showing the learned gaits in different tasks.
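The interplay the abstract describes (a parametric foot trajectory generator supplying a motion prior, a policy adding a residual on top, and the two being optimized in alternation) can be illustrated with a minimal sketch. This is not the authors' implementation: the half-sine swing profile, the parameter names `step_length` and `step_height`, the elitist mutation step, `toy_fitness`, and the placeholder `update_policy` are all assumptions made for illustration, and the fitness function merely stands in for a simulated rollout return.

```python
import math
import random


def foot_trajectory(phase, step_length, step_height):
    """Return an assumed (x, z) foot offset for a gait phase in [0, 1).

    First half of the cycle: swing (foot lifted and moved forward).
    Second half: stance (foot on the ground, moved backward).
    """
    phase = phase % 1.0
    if phase < 0.5:
        s = phase / 0.5
        x = step_length * (s - 0.5)
        z = step_height * math.sin(math.pi * s)
    else:
        s = (phase - 0.5) / 0.5
        x = step_length * (0.5 - s)
        z = 0.0
    return x, z


def control_signal(phase, tg_params, residual):
    """Motion prior from the generator plus the policy's residual output."""
    x, z = foot_trajectory(phase, *tg_params)
    return x + residual[0], z + residual[1]


def toy_fitness(tg_params):
    """Hypothetical stand-in for a rollout return (prefers a 0.10 x 0.05 step)."""
    length, height = tg_params
    return -((length - 0.10) ** 2 + (height - 0.05) ** 2)


def evolve_generator(tg_params, fitness, rng, population=16, sigma=0.02):
    """One elitist evolutionary step: best of the parent and mutated offspring."""
    candidates = [tuple(tg_params)]
    for _ in range(population):
        child = tuple(max(0.0, p + rng.gauss(0.0, sigma)) for p in tg_params)
        candidates.append(child)
    return max(candidates, key=fitness)


def update_policy(residual, tg_params):
    """Placeholder: an RL update of the residual policy would go here."""
    return residual


def train(iterations=50, seed=0):
    rng = random.Random(seed)
    tg_params = (0.02, 0.01)  # assumed initial step length and height
    residual = (0.0, 0.0)
    for _ in range(iterations):
        # Policy held fixed while the generator parameters are evolved.
        tg_params = evolve_generator(tg_params, toy_fitness, rng)
        # Generator held fixed while the policy is updated.
        residual = update_policy(residual, tg_params)
    return tg_params, residual
```

Because the evolutionary step keeps the parent among the candidates, the generator's fitness never decreases between iterations in this sketch; a real system would instead score each candidate with rollouts of the current policy.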

Pages: 3085-3092

Page count: 8

50 entries in total

[1] Real-Time Trajectory Adaptation for Quadrupedal Locomotion using Deep Reinforcement Learning
Gangapurwala, Siddhant
Geisert, Mathieu
Orsolino, Romeo
Fallon, Maurice
Havoutis, Ioannis
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 5973 - 5979
[2] Policy gradient reinforcement learning for fast quadrupedal locomotion
Kohl, N
Stone, P
[J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-5, PROCEEDINGS, 2004: 2619 - 2624
[3] Automated Hyperparameter Tuning in Reinforcement Learning for Quadrupedal Robot Locomotion
Kim, Myeongseop
Kim, Jung-Su
Park, Jae-Han
[J]. ELECTRONICS, 2024, 13 (01)
[4] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
Wei, Lang
Li, Yunxiang
Ai, Yunfei
Wu, Yuze
Xu, Hao
Wang, Wei
[J]. INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING, 2023, 24 (9) : 1599 - 1613
[5] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
Lang Wei
Yunxiang Li
Yunfei Ai
Yuze Wu
Hao Xu
Wei Wang
Guoming Hu
[J]. International Journal of Precision Engineering and Manufacturing, 2023, 24 : 1599 - 1613
[6] Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum
Kobayashi, Taisuke
Sugino, Toshiki
[J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95
[7] Quadrupedal Locomotion in an Energy-efficient Way Based on Reinforcement Learning
Hao, Tiantian
Xu, De
Yan, Shaohua
[J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) : 1613 - 1623
[8] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
Jin, Yongbin
Liu, Xianwei
Shao, Yecheng
Wang, Hongtao
Yang, Wei
[J]. NATURE MACHINE INTELLIGENCE, 2022, 4 (12) : 1198 - 1208
[9] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
Yongbin Jin
Xianwei Liu
Yecheng Shao
Hongtao Wang
Wei Yang
[J]. Nature Machine Intelligence, 2022, 4 : 1198 - 1208
[10] The Evolutionary Locomotion of Tripedal and Quadrupedal Biomorphic Robots
Qiu, Guo-Yuan
Wu, Shih-Hung
[J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2013, 29 (04) : 681 - 693