Planning Approximate Exploration Trajectories for Model-Free Reinforcement Learning in Contact-Rich Manipulation

Cited by: 16
Authors
Hoppe, Sabrina [1, 2]
Lou, Zhongyu [1 ]
Hennes, Daniel [2 ]
Toussaint, Marc [2 ]
Affiliations
[1] Bosch Corp Res, D-71272 Stuttgart, Germany
[2] Univ Stuttgart, Machine Learning & Robot Lab, D-70174 Stuttgart, Germany
Source
IEEE ROBOTICS AND AUTOMATION LETTERS, 2019
Keywords
Deep learning in robotics and automation; dexterous manipulation; learning and adaptive systems
DOI
10.1109/LRA.2019.2928212
CLC Classification
TP24 [Robotics]
Subject Classification
080202; 1405
Abstract
Recent progress in deep reinforcement learning has enabled simulated agents to learn complex behavior policies from scratch, but their data complexity often prohibits real-world applications. The learning process can be sped up by expert demonstrations, but these can be costly to acquire. We demonstrate that it is possible to employ model-free deep reinforcement learning combined with planning to quickly generate informative data for a manipulation task. In particular, we use an approximate trajectory optimization approach for global exploration based on an upper confidence bound of the advantage function. The advantage is approximated by a Q-learning network with separately updated streams for state value and advantage, which allows ensembles to approximate model uncertainty for one stream only. We evaluate our method on new extensions to the classical peg-in-hole task, one of which is only solvable by active use of contacts between peg tips and holes. The experimental evaluation suggests that our method explores more relevant areas of the environment and finds exemplar solutions faster, both on a real robot and in simulation. Combining our exploration with learning from demonstration outperforms state-of-the-art model-free reinforcement learning in terms of convergence speed for contact-rich manipulation tasks.
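The mechanism described in the abstract, a dueling Q-network whose advantage stream alone is ensembled so that disagreement between heads approximates model uncertainty, combined with an upper confidence bound (UCB) over the advantage as an exploration score, can be sketched in a few lines. The following is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the names EnsembleDuelingQNet and ucb_advantage, the head count n_heads, and the exploration weight beta are illustrative assumptions, and discrete action scoring stands in for the paper's approximate trajectory optimization.

# Hypothetical sketch of the dueling architecture with an ensembled
# advantage stream (assumed names; not the paper's released code).
import torch
import torch.nn as nn

class EnsembleDuelingQNet(nn.Module):
    """Q_k(s, a) = V(s) + A_k(s, a) - mean_a A_k(s, a), one advantage head per ensemble member k."""

    def __init__(self, state_dim, action_dim, hidden=128, n_heads=5):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)  # single, shared state-value stream
        # Ensemble of advantage streams: their spread approximates
        # model uncertainty for this one stream only.
        self.adv_heads = nn.ModuleList(
            [nn.Linear(hidden, action_dim) for _ in range(n_heads)]
        )

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)                                          # (B, 1)
        advs = torch.stack([head(h) for head in self.adv_heads])   # (K, B, A)
        advs = advs - advs.mean(dim=-1, keepdim=True)              # center for identifiability
        q = v.unsqueeze(0) + advs                                  # (K, B, A)
        return q, advs

def ucb_advantage(net, state, beta=1.0):
    # Upper confidence bound on the advantage: ensemble mean plus
    # beta times the ensemble standard deviation, per action.
    with torch.no_grad():
        _, advs = net(state)
    return advs.mean(dim=0) + beta * advs.std(dim=0)               # (B, A)

In the paper, this bound serves as the objective of an approximate trajectory optimizer that plans whole exploration trajectories; the function above merely ranks single discrete actions to keep the sketch self-contained.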
Pages: 4042 - 4047
Page count: 6