Reinforcement Learning With Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion

Cited by: 0
Authors
Shi, Haojie [1, 2]
Zhou, Bo [2 ]
Zeng, Hongsheng [2 ]
Wang, Fan [2 ]
Dong, Yueqiang [2 ]
Li, Jiangyong [2 ]
Wang, Kang [2 ]
Tian, Hao [2 ]
Meng, Max Q-H [3, 4]
Affiliations
[1] Chinese Univ Hong Kong, Shenzhen 518057, Peoples R China
[2] Baidu, Beijing 100193, Peoples R China
[3] Southern Univ Sci & Technol, Shenzhen Key Lab Robot Percept & Intelligence, Dept Elect & Elect Engn, Shenzhen 518055, Peoples R China
[4] Chinese Univ Hong Kong, Dept Elect Engn, Shenzhen 518057, Peoples R China
Keywords
Legged robots; machine learning for robot control; reinforcement learning; ROBOTS;
DOI
10.1109/LRA.2022.3145495
Chinese Library Classification (CLC) number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Recently, reinforcement learning (RL) has emerged as a promising approach for quadrupedal locomotion, saving the manual effort that conventional approaches require, such as designing skill-specific controllers. However, due to the complex nonlinear dynamics of quadrupedal robots and reward sparsity, it is still difficult for RL to learn effective gaits from scratch, especially in challenging tasks such as walking over a balance beam. To alleviate this difficulty, we propose a novel RL-based approach that contains an evolutionary foot trajectory generator. Unlike prior methods that use a fixed trajectory generator, the generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning. The policy is trained with reinforcement learning to output residual control signals that fit different gaits. We then optimize the trajectory generator and the policy network alternately to stabilize training, and share the exploratory data to improve sample efficiency. As a result, our approach can solve a range of challenging tasks in simulation by learning from scratch, including walking on a balance beam and crawling through a cave. To further verify the effectiveness of our approach, we deploy the controller learned in simulation on a 12-DoF quadrupedal robot, and it successfully traverses challenging scenarios with efficient gaits. We provide a video on YouTube showing the learned gaits in different tasks.
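The alternating scheme described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the trajectory shape, the two-parameter residual, the `TARGET` values, the toy objective, and the function names are all hypothetical. A simple keep-the-best random search stands in for both the evolutionary update of the generator and the RL update of the policy. The sketch only shows the structure: the command is a trajectory prior plus a residual, and the two parts are improved in turn while the other is frozen.

```python
import math
import random

TARGET = (0.12, 0.05)  # hypothetical "ideal" step length and height, in metres


def foot_trajectory(phase, params):
    """Hypothetical open-loop foot path: (x, z) offset for a gait phase in [0, 1)."""
    step_length, step_height = params
    angle = 2.0 * math.pi * phase
    x = 0.5 * step_length * math.cos(angle)
    z = step_height * max(0.0, math.sin(angle))  # foot lifts only in the swing half
    return x, z


def residual(phase, weights):
    """Stand-in for the learned policy: a small correction to the prior."""
    angle = 2.0 * math.pi * phase
    return weights[0] * math.cos(angle), weights[1] * math.sin(angle)


def command(phase, params, weights):
    """Final command = trajectory prior + residual signal."""
    tx, tz = foot_trajectory(phase, params)
    rx, rz = residual(phase, weights)
    return tx + rx, tz + rz


def toy_return(params, weights, steps=20):
    """Toy objective: negative squared error to the target foot path."""
    total = 0.0
    for i in range(steps):
        phase = i / steps
        cx, cz = command(phase, params, weights)
        gx, gz = foot_trajectory(phase, TARGET)
        total -= (cx - gx) ** 2 + (cz - gz) ** 2
    return total


def hill_climb(vector, score, rng, sigma=0.01, population=8):
    """Return the best of `population` Gaussian perturbations, or `vector` if none is better."""
    best, best_score = vector, score(vector)
    for _ in range(population):
        candidate = tuple(v + rng.gauss(0.0, sigma) for v in vector)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


def train(rounds=30, seed=0):
    rng = random.Random(seed)
    params, weights = (0.05, 0.01), (0.0, 0.0)
    history = [toy_return(params, weights)]
    for _ in range(rounds):
        # Phase 1: evolve the generator parameters with the policy frozen.
        params = hill_climb(params, lambda p: toy_return(p, weights), rng)
        # Phase 2: improve the residual policy with the generator frozen
        # (random search here, standing in for an RL update).
        weights = hill_climb(weights, lambda w: toy_return(params, w), rng)
        history.append(toy_return(params, weights))
    return params, weights, history
```

Calling `train()` returns the final generator parameters, the residual weights, and the toy return after each round. Because each phase only accepts a candidate that scores higher, the recorded return never decreases in this sketch; real RL training offers no such guarantee.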
Pages: 3085-3092
Number of pages: 8