Planning Approximate Exploration Trajectories for Model-Free Reinforcement Learning in Contact-Rich Manipulation

Cited: 16
|
Authors
Hoppe, Sabrina [1 ,2 ]
Lou, Zhongyu [1 ]
Hennes, Daniel [2 ]
Toussaint, Marc [2 ]
Affiliations
[1] Bosch Corp Res, D-71272 Stuttgart, Germany
[2] Univ Stuttgart, Machine Learning & Robot Lab, D-70174 Stuttgart, Germany
Source
Keywords
Deep learning in robotics and automation; dexterous manipulation; learning and adaptive systems;
DOI
10.1109/LRA.2019.2928212
CLC Classification Number
TP24 [Robotics];
Discipline Classification Code
080202 ; 1405 ;
Abstract
Recent progress in deep reinforcement learning has enabled simulated agents to learn complex behavior policies from scratch, but their data complexity often prohibits real-world applications. The learning process can be sped up by expert demonstrations, but those can be costly to acquire. We demonstrate that it is possible to employ model-free deep reinforcement learning combined with planning to quickly generate informative data for a manipulation task. In particular, we use an approximate trajectory optimization approach for global exploration based on an upper confidence bound of the advantage function. The advantage is approximated by a network for Q-learning with separately updated streams for state value and advantage, which allows ensembles to approximate model uncertainty for one stream only. We evaluate our method on new extensions to the classical peg-in-hole task, one of which is only solvable by active usage of contacts between peg tips and holes. The experimental evaluation suggests that our method explores more relevant areas of the environment and finds exemplar solutions faster, both on a real robot and in simulation. Combining our exploration with learning from demonstration outperforms state-of-the-art model-free reinforcement learning in terms of convergence speed for contact-rich manipulation tasks.
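The dueling decomposition the abstract describes can be sketched as follows. This is a minimal NumPy illustration under assumed toy shapes and random weights (all names, dimensions, and the two-layer form of each stream are hypothetical, not taken from the paper): a single state-value stream V(s) is shared, while an ensemble of independently initialized advantage heads provides a spread whose standard deviation serves as the uncertainty term in an upper confidence bound.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, N_ACTIONS, N_ENSEMBLE, HIDDEN = 4, 3, 5, 16
BETA = 1.0  # weight of the uncertainty bonus in the UCB

# Single state-value stream V(s): one set of toy weights.
W_v = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN))
w_v = rng.normal(scale=0.1, size=(HIDDEN,))

# Advantage stream: an ensemble of independently initialized heads.
# The spread across heads stands in for model uncertainty of A(s, a).
W_a = rng.normal(scale=0.1, size=(N_ENSEMBLE, STATE_DIM, HIDDEN))
w_a = rng.normal(scale=0.1, size=(N_ENSEMBLE, HIDDEN, N_ACTIONS))

def advantage_ensemble(s):
    """All ensemble members' advantage estimates, shape (N_ENSEMBLE, N_ACTIONS)."""
    h = np.tanh(s @ W_a)                 # (N_ENSEMBLE, HIDDEN)
    a = np.einsum("eh,eha->ea", h, w_a)  # (N_ENSEMBLE, N_ACTIONS)
    # Dueling identifiability constraint: advantages are zero-mean over actions.
    return a - a.mean(axis=1, keepdims=True)

def q_values(s):
    """Q(s, a) = V(s) + ensemble-mean advantage."""
    v = np.tanh(s @ W_v) @ w_v
    return v + advantage_ensemble(s).mean(axis=0)

def ucb_action(s, beta=BETA):
    """Exploration: pick the action maximizing mean advantage + beta * ensemble std."""
    a = advantage_ensemble(s)
    ucb = a.mean(axis=0) + beta * a.std(axis=0)
    return int(np.argmax(ucb)), ucb

state = rng.normal(size=STATE_DIM)
action, ucb = ucb_action(state)
```

Note that in the paper the bound drives approximate trajectory optimization over whole exploration trajectories; this sketch only shows the per-state upper confidence bound, not the planning step.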
Pages: 4042-4047
Page count: 6
Related Papers
50 records in total
  • [1] A review on reinforcement learning for contact-rich robotic manipulation tasks
    Elguea-Aguinaco, Inigo
    Serrano-Munoz, Antonio
    Chrysostomou, Dimitrios
    Inziarte-Hidalgo, Ibai
    Bogh, Simon
    Arana-Arexolaleiba, Nestor
    [J]. ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2023, 81
  • [2] Stability-Guaranteed Reinforcement Learning for Contact-Rich Manipulation
    Khader, Shahbaz Abdul
    Yin, Hang
    Falco, Pietro
    Kragic, Danica
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (01): : 1 - 8
  • [3] A Contact-Safe Reinforcement Learning Framework for Contact-Rich Robot Manipulation
    Zhu, Xiang
    Kang, Shucheng
    Chen, Jianyu
    [J]. 2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 2476 - 2482
  • [4] Model-Free Active Exploration in Reinforcement Learning
    Russo, Alessio
    Proutiere, Alexandre
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Variable Impedance Skill Learning for Contact-Rich Manipulation
    Yang, Quantao
    Durr, Alexander
    Topp, Elin Anna
    Stork, Johannes A.
    Stoyanov, Todor
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03): : 8391 - 8398
  • [6] Learning Dense Rewards for Contact-Rich Manipulation Tasks
    Wu, Zheng
    Lian, Wenzhao
    Unhelkar, Vaibhav
    Tomizuka, Masayoshi
    Schaal, Stefan
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 6214 - 6221
  • [7] Implicit contact-rich manipulation planning for a manipulator with insufficient payload
    Nakatsuru, Kento
    Wan, Weiwei
    Harada, Kensuke
    [J]. ROBOTIC INTELLIGENCE AND AUTOMATION, 2023, 43 (04): : 394 - 405
  • [8] Improving Optimistic Exploration in Model-Free Reinforcement Learning
    Grzes, Marek
    Kudenko, Daniel
    [J]. ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, 2009, 5495 : 360 - 369
  • [9] Data-Efficient Model Learning and Prediction for Contact-Rich Manipulation Tasks
    Khader, Shahbaz Abdul
    Yin, Hang
    Falco, Pietro
    Kragic, Danica
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (03): : 4321 - 4328
  • [10] Unified Human-Robot-Environment Interaction Control in Contact-Rich Collaborative Manipulation Tasks via Model-Based Reinforcement Learning
    Liu, Xing
    Liu, Yu
    Liu, Zhengxiong
    Huang, Panfeng
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2023, 70 (11) : 11474 - 11482