Policy gradient reinforcement learning for fast quadrupedal locomotion

Cited by: 249
Authors
Kohl, N [1]
Stone, P [1]
Affiliations
[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA
Keywords
learning control; walking robots; multi legged robots;
DOI
10.1109/ROBOT.2004.1307456
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a machine learning approach to optimizing a quadrupedal trot gait for forward speed. Given a parameterized walk designed for a specific robot, we propose using a form of policy gradient reinforcement learning to automatically search the set of possible parameters with the goal of finding the fastest possible walk. We implement and test our approach on a commercially available quadrupedal robot platform, namely the Sony Aibo robot. After about three hours of learning, all on the physical robots and with no human intervention other than to change the batteries, the robots achieved a gait faster than any previously known gait for the Aibo, significantly outperforming a variety of existing hand-coded and learned solutions.
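The parameter search described in the abstract can be illustrated with a generic finite-difference policy gradient loop. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: `toy_speed` is a hypothetical stand-in for timing a walk on the physical robot, the three-parameter vector and the values of `epsilon`, `eta`, and `num_policies` are illustrative, and the update rule is one common way of estimating a gradient from randomly perturbed policies.

```python
import math
import random


def mean(values):
    """Average of a list, or None when the list is empty."""
    return sum(values) / len(values) if values else None


def policy_gradient_step(params, score_fn, epsilon, eta, num_policies, rng):
    """One update: evaluate random perturbations, then step along the estimate."""
    n = len(params)
    trials = []
    for _ in range(num_policies):
        # Each parameter is shifted by -epsilon, 0, or +epsilon at random.
        signs = [rng.choice((-1, 0, 1)) for _ in range(n)]
        candidate = [p + s * epsilon for p, s in zip(params, signs)]
        trials.append((signs, score_fn(candidate)))

    adjustment = []
    for i in range(n):
        # Average score of the trials grouped by how parameter i was perturbed.
        avg = {s: mean([score for signs, score in trials if signs[i] == s])
               for s in (-1, 0, 1)}
        if avg[1] is None or avg[-1] is None:
            adjustment.append(0.0)  # not enough data for this parameter
        elif avg[0] is not None and avg[0] > avg[1] and avg[0] > avg[-1]:
            adjustment.append(0.0)  # leaving the parameter alone looks best
        else:
            adjustment.append(avg[1] - avg[-1])

    norm = math.sqrt(sum(a * a for a in adjustment))
    if norm == 0.0:
        return list(params)
    # Normalize so that every update has the fixed length eta.
    return [p + eta * a / norm for p, a in zip(params, adjustment)]


def toy_speed(params):
    """Hypothetical objective: peaks at an arbitrary 'best gait' vector."""
    target = (1.0, -2.0, 0.5)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def optimize(params, score_fn, iterations=100, epsilon=0.1, eta=0.1,
             num_policies=15, seed=0):
    rng = random.Random(seed)
    for _ in range(iterations):
        params = policy_gradient_step(params, score_fn, epsilon, eta,
                                      num_policies, rng)
    return params


initial = [0.0, 0.0, 0.0]
learned = optimize(initial, toy_speed)
print("initial score:", toy_speed(initial))
print("learned score:", toy_speed(learned))
```

On a real robot each call to the scoring function would be a timed walk, so the evaluations would be noisy and expensive; the number of trial policies per step then trades off gradient quality against wall-clock time.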
Pages: 2619-2624
Page count: 6
Related papers
50 in total
  • [1] Machine learning for fast quadrupedal locomotion
    Kohl, N
    Stone, P
    [J]. PROCEEDING OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE SIXTEENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, : 611 - 616
  • [2] Automated Hyperparameter Tuning in Reinforcement Learning for Quadrupedal Robot Locomotion
    Kim, Myeongseop
    Kim, Jung-Su
    Park, Jae-Han
    [J]. ELECTRONICS, 2024, 13 (01)
  • [3] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
    Wei, Lang
    Li, Yunxiang
    Ai, Yunfei
    Wu, Yuze
    Xu, Hao
    Wang, Wei
    [J]. INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING, 2023, 24 (9) : 1599 - 1613
  • [4] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
    Lang Wei
    Yunxiang Li
    Yunfei Ai
    Yuze Wu
    Hao Xu
    Wei Wang
    Guoming Hu
    [J]. International Journal of Precision Engineering and Manufacturing, 2023, 24 : 1599 - 1613
  • [5] Learning in a high dimensional space: Fast omnidirectional quadrupedal locomotion
    Hebbel, Matthias
    Nistico, Walter
    Fisseler, Denis
    [J]. ROBOCUP 2006: ROBOT SOCCER WORLD CUP X, 2007, 4434 : 314 - +
  • [6] Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum
    Kobayashi, Taisuke
    Sugino, Toshiki
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95
  • [7] Reinforcement Learning With Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion
    Shi, Haojie
    Zhou, Bo
    Zeng, Hongsheng
    Wang, Fan
    Dong, Yueqiang
    Li, Jiangyong
    Wang, Kang
    Tian, Hao
    Meng, Max Q-H
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02) : 3085 - 3092
  • [8] Quadrupedal Locomotion in an Energy-efficient Way Based on Reinforcement Learning
    Hao, Tiantian
    Xu, De
    Yan, Shaohua
    [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) : 1613 - 1623
  • [9] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
    Jin, Yongbin
    Liu, Xianwei
    Shao, Yecheng
    Wang, Hongtao
    Yang, Wei
    [J]. NATURE MACHINE INTELLIGENCE, 2022, 4 (12) : 1198 - 1208
  • [10] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
    Yongbin Jin
    Xianwei Liu
    Yecheng Shao
    Hongtao Wang
    Wei Yang
    [J]. Nature Machine Intelligence, 2022, 4 : 1198 - 1208