Policy gradient reinforcement learning for fast quadrupedal locomotion

被引:249
|
作者
Kohl, N [1 ]
Stone, P [1 ]
机构
[1] Univ Texas, Dept Comp Sci, Austin, TX 78712 USA
关键词
learning control; walking robots; multi legged robots;
D O I
10.1109/ROBOT.2004.1307456
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This paper presents a machine learning approach to optimizing a quadrupedal trot gait for forward speed. Given a parameterized walk designed for a specific robot, we propose using a form of policy gradient reinforcement learning to automatically search the set of possible parameters with the goal of finding the fastest possible walk. We implement and test our approach on a commercially available quadrupedal robot platform, namely the Sony Aibo robot. After about three hours of learning, all on the physical robots and with no human intervention other than to change the batteries, the robots achieved a gait faster than any previously known gait known for the Aibo, significantly outperforming a variety of existing hand-coded and learned solutions.
引用
收藏
页码:2619 / 2624
页数:6
相关论文
共 50 条
  • [21] A modification of gradient policy in reinforcement learning procedure
    Abas, Marcel
    Skripcak, Tomas
    [J]. 2012 15TH INTERNATIONAL CONFERENCE ON INTERACTIVE COLLABORATIVE LEARNING (ICL), 2012,
  • [22] Adaptive Natural Policy Gradient in Reinforcement Learning
    Li, Dazi
    Qiao, Zengyuan
    Song, Tianheng
    Jin, Qibing
    [J]. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
  • [23] Policy Gradient Method For Robust Reinforcement Learning
    Wang, Yue
    Zou, Shaofeng
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [24] Reinforcement Learning to Rank with Pairwise Policy Gradient
    Xu, Jun
    Wei, Zeng
    Xia, Long
    Lan, Yanyan
    Yin, Dawei
    Cheng, Xueqi
    Wen, Ji-Rong
    [J]. PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 509 - 518
  • [25] Scalable Multitask Policy Gradient Reinforcement Learning
    El Bsat, Salam
    Ammar, Haitham Bou
    Taylor, Matthew E.
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1847 - 1853
  • [26] Toward Fast Policy Search for Learning Legged Locomotion
    Deisenroth, Marc Peter
    Calandra, Roberto
    Seyfarth, Andre
    Peters, Jan
    [J]. 2012 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2012, : 1787 - 1792
  • [27] Hierarchical Terrain-Aware Control for Quadrupedal Locomotion by Combining Deep Reinforcement Learning and Optimal Control
    Yao, Qingfeng
    Wang, Jilong
    Wang, Donglin
    Yang, Shuyu
    Zhang, Hongyin
    Wang, Yinuo
    Wu, Zhengqing
    [J]. 2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 4546 - 4551
  • [28] CPG-Based Hierarchical Locomotion Control for Modular Quadrupedal Robots Using Deep Reinforcement Learning
    Wang, Jiayu
    Hu, Chuxiong
    Zhu, Yu
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (04) : 7193 - 7200
  • [29] Learning Advanced Locomotion for Quadrupedal Robots: A Distributed Multi-Agent Reinforcement Learning Framework with Riemannian Motion Policies
    Wang, Yuliu
    Sagawa, Ryusuke
    Yoshiyasu, Yusuke
    [J]. ROBOTICS, 2024, 13 (06)
  • [30] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
    Kim, Dong-Ki
    Liu, Miao
    Riemer, Matthew
    Sun, Chuangchuang
    Abdulhai, Marwa
    Habibi, Golnaz
    Lopez-Cot, Sebastian
    Tesauro, Gerald
    How, Jonathan P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139