Safe Reinforcement Learning for Legged Locomotion

Cited by: 6

Authors:

Yang, Tsung-Yen [1,2]

Zhang, Tingnan [2]

Luu, Linda [2]

Ha, Sehoon [2,3]

Tan, Jie [2]

Yu, Wenhao [2]

Affiliations:

[1] Princeton Univ, Princeton, NJ 08544 USA

[2] Google Res, Mountain View, CA 94043 USA

[3] Georgia Inst Technol, Atlanta, GA 30332 USA

Source:

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2022

Keywords:

DOI:

10.1109/IROS47612.2022.9982038

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Designing control policies for legged locomotion is complex due to the under-actuated and non-continuous robot dynamics. Model-free reinforcement learning provides promising tools to tackle this challenge. However, a major bottleneck of applying model-free reinforcement learning in the real world is safety. In this paper, we propose a safe reinforcement learning framework that switches between a safe recovery policy, which prevents the robot from entering unsafe states, and a learner policy, which is optimized to complete the task. The safe recovery policy takes over control when the learner policy violates safety constraints, and hands control back when no future safety violations are predicted. We design the safe recovery policy so that it ensures the safety of quadruped locomotion while minimally intervening in the learning process. Furthermore, we theoretically analyze the proposed framework and provide an upper bound on the task performance. We verify the proposed framework in four tasks on a simulated and a real quadrupedal robot: efficient gait, catwalk, two-leg balance, and pacing. On average, our method achieves 48.6% fewer falls and comparable or better rewards than the baseline methods in simulation. When deployed on a real-world quadruped robot, our training pipeline enables a 34% improvement in energy efficiency for the efficient gait, 40.9% narrower feet placement in the catwalk, and twice the jumping duration in the two-leg balance. Our method incurs fewer than five falls over 115 minutes of hardware time.
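The switching rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D state, the toy dynamics in `step`, the constant `SAFE_LIMIT`, the look-ahead check `violates_soon`, and both policy functions are hypothetical stand-ins chosen only to show a recovery policy taking over when a violation is predicted and handing control back otherwise.

```python
from typing import Callable, List, Tuple

State = float   # hypothetical 1-D state, e.g. a body tilt angle
Action = float

SAFE_LIMIT = 1.0  # assumed safety constraint: |state| <= SAFE_LIMIT


def step(state: State, action: Action) -> State:
    """Toy dynamics, used only for illustration."""
    return state + action


def learner_policy(state: State) -> Action:
    """Stand-in for the task-optimized policy; it always pushes outward."""
    return 0.3


def recovery_policy(state: State) -> Action:
    """Stand-in for the safe recovery policy; it pulls the state toward 0."""
    return -0.5 * state


def violates_soon(state: State,
                  policy: Callable[[State], Action],
                  horizon: int = 2) -> bool:
    """Roll the toy model forward and report any predicted violation."""
    for _ in range(horizon):
        state = step(state, policy(state))
        if abs(state) > SAFE_LIMIT:
            return True
    return False


def select_action(state: State) -> Tuple[Action, str]:
    """Use the learner unless its predicted rollout breaks the constraint."""
    if violates_soon(state, learner_policy):
        return recovery_policy(state), "recovery"
    return learner_policy(state), "learner"


def rollout(state: State = 0.0, num_steps: int = 10) -> List[Tuple[str, State]]:
    """Run the switching controller and log which policy acted at each step."""
    log = []
    for _ in range(num_steps):
        action, who = select_action(state)
        state = step(state, action)
        log.append((who, state))
    return log


if __name__ == "__main__":
    for who, state in rollout():
        print(f"{who:8s} state={state:.2f}")
```

In this toy setting the learner acts until its two-step look-ahead predicts a violation, at which point the recovery policy acts for that step. The paper's actual safety criterion, dynamics model, and recovery controller are not specified in this record.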


Pages: 2454-2461

Page count: 8
