Policy gradient reinforcement learning for fast quadrupedal locomotion

Cited by: 249

Authors:

Kohl, N [1]

Stone, P [1]

Affiliations:

[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-5, PROCEEDINGS | 2004

Keywords:

learning control; walking robots; multi-legged robots;

DOI:

10.1109/ROBOT.2004.1307456

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents a machine learning approach to optimizing a quadrupedal trot gait for forward speed. Given a parameterized walk designed for a specific robot, we propose using a form of policy gradient reinforcement learning to automatically search the set of possible parameters with the goal of finding the fastest possible walk. We implement and test our approach on a commercially available quadrupedal robot platform, namely the Sony Aibo robot. After about three hours of learning, all on the physical robots and with no human intervention other than to change the batteries, the robots achieved a gait faster than any previously known gait for the Aibo, significantly outperforming a variety of existing hand-coded and learned solutions.


Pages: 2619-2624

Page count: 6

50 items in total

[1] Machine learning for fast quadrupedal locomotion
Kohl, N
Stone, P
[J]. PROCEEDING OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE SIXTEENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, : 611 - 616
[2] Automated Hyperparameter Tuning in Reinforcement Learning for Quadrupedal Robot Locomotion
Kim, Myeongseop
Kim, Jung-Su
Park, Jae-Han
[J]. ELECTRONICS, 2024, 13 (01)
[3] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
Wei, Lang
Li, Yunxiang
Ai, Yunfei
Wu, Yuze
Xu, Hao
Wang, Wei
[J]. INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING, 2023, 24 (9) : 1599 - 1613
[4] Learning Multiple-Gait Quadrupedal Locomotion via Hierarchical Reinforcement Learning
Lang Wei
Yunxiang Li
Yunfei Ai
Yuze Wu
Hao Xu
Wei Wang
Guoming Hu
[J]. International Journal of Precision Engineering and Manufacturing, 2023, 24 : 1599 - 1613
[5] Learning in a high dimensional space: Fast omnidirectional quadrupedal locomotion
Hebbel, Matthias
Nistico, Walter
Fisseler, Denis
[J]. ROBOCUP 2006: ROBOT SOCCER WORLD CUP X, 2007, 4434 : 314 - +
[6] Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum
Kobayashi, Taisuke
Sugino, Toshiki
[J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 95
[7] Reinforcement Learning With Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion
Shi, Haojie
Zhou, Bo
Zeng, Hongsheng
Wang, Fan
Dong, Yueqiang
Li, Jiangyong
Wang, Kang
Tian, Hao
Meng, Max Q-H
[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02) : 3085 - 3092
[8] Quadrupedal Locomotion in an Energy-efficient Way Based on Reinforcement Learning
Hao, Tiantian
Xu, De
Yan, Shaohua
[J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) : 1613 - 1623
[9] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
Jin, Yongbin
Liu, Xianwei
Shao, Yecheng
Wang, Hongtao
Yang, Wei
[J]. NATURE MACHINE INTELLIGENCE, 2022, 4 (12) : 1198 - 1208
[10] High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning
Yongbin Jin
Xianwei Liu
Yecheng Shao
Hongtao Wang
Wei Yang
[J]. Nature Machine Intelligence, 2022, 4 : 1198 - 1208
