Policy gradient reinforcement learning for fast quadrupedal locomotion

Cited by: 249

Authors:

Kohl, N [1]

Stone, P [1]

Affiliations:

[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-5, PROCEEDINGS | 2004

Keywords:

learning control; walking robots; multi legged robots;

DOI:

10.1109/ROBOT.2004.1307456

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents a machine learning approach to optimizing a quadrupedal trot gait for forward speed. Given a parameterized walk designed for a specific robot, we propose using a form of policy gradient reinforcement learning to automatically search the set of possible parameters with the goal of finding the fastest possible walk. We implement and test our approach on a commercially available quadrupedal robot platform, namely the Sony Aibo robot. After about three hours of learning, all on the physical robots and with no human intervention other than to change the batteries, the robots achieved a gait faster than any previously known gait for the Aibo, significantly outperforming a variety of existing hand-coded and learned solutions.
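The abstract does not spell out the gradient estimator, so the following is only a minimal Python sketch of one generic way to search a walk's parameter vector by policy gradient: evaluate randomly perturbed parameter sets, estimate an ascent direction per parameter from the scores, and take a fixed-size step. Every name and constant here (`estimate_direction`, `optimize`, `toy_speed`, the perturbation sizes, the number of trial policies) is a hypothetical illustration and not taken from the paper; `toy_speed` stands in for a speed measurement that would come from timing the physical robot.

```python
import math
import random


def estimate_direction(params, score, epsilons, num_policies, rng):
    """Estimate an ascent direction from randomly perturbed parameter sets."""
    trials = []
    for _ in range(num_policies):
        # Perturb every parameter by -eps, 0, or +eps, chosen at random.
        signs = [rng.choice((-1, 0, 1)) for _ in params]
        candidate = [p + s * e for p, s, e in zip(params, signs, epsilons)]
        trials.append((signs, score(candidate)))

    direction = []
    for i in range(len(params)):
        plus = [value for signs, value in trials if signs[i] == 1]
        minus = [value for signs, value in trials if signs[i] == -1]
        if not plus or not minus:
            # Not enough information about this dimension in this batch.
            direction.append(0.0)
        else:
            # Difference between the mean score of +eps and -eps trials.
            direction.append(sum(plus) / len(plus) - sum(minus) / len(minus))
    return direction


def optimize(params, score, epsilons, step_size=0.1, iterations=200,
             num_policies=15, seed=0):
    """Repeatedly step the parameters along the normalized estimated direction."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(iterations):
        direction = estimate_direction(params, score, epsilons, num_policies, rng)
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            continue  # No usable signal in this batch; try again.
        params = [p + step_size * d / norm for p, d in zip(params, direction)]
    return params


def toy_speed(params):
    """Hypothetical stand-in for measured walk speed, peaking at (3, -1)."""
    x, y = params
    return -((x - 3.0) ** 2 + (y + 1.0) ** 2)


if __name__ == "__main__":
    tuned = optimize([0.0, 0.0], toy_speed, epsilons=[0.1, 0.1])
    print("tuned parameters:", tuned, "score:", toy_speed(tuned))
```

On a real robot each call to `score` would be a timed walk and therefore noisy and expensive, which is why a sketch like this evaluates only a small batch of perturbed policies per step instead of probing each parameter separately.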

Citation

Pages: 2619-2624

Number of pages: 6
