End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer

Cited by: 30
Authors
Yuan, Weihao [1 ]
Hang, Kaiyu [2 ]
Kragic, Danica [3 ]
Wang, Michael Y. [1 ]
Stork, Johannes A. [4 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, ECE, Robot Inst, Hong Kong, Peoples R China
[2] Yale Univ, Mech Engn & Mat Sci, New Haven, CT USA
[3] KTH Royal Inst Technol, EECS, Ctr Autonomous Syst, Stockholm, Sweden
[4] Orebro Univ, Ctr Appl Autonomous Sensor Syst, Orebro, Sweden
Keywords
Nonprehensile rearrangement; Deep reinforcement learning; Transfer learning
DOI
10.1016/j.robot.2019.06.007
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Subject Classification Code
0812
Abstract
Nonprehensile rearrangement is the problem of controlling a robot to interact with objects through pushing actions in order to reconfigure the objects into a predefined goal pose. In this work, we rearrange one object at a time in an environment with obstacles using an end-to-end policy that maps raw pixels as visual input to control actions without any form of engineered feature extraction. To reduce the amount of training data that needs to be collected with a real robot, we propose a simulation-to-reality transfer approach. In the first step, we model the nonprehensile rearrangement task in simulation and use deep reinforcement learning to learn a suitable rearrangement policy, which requires on the order of hundreds of thousands of example actions for training. Thereafter, we collect a small dataset of only 70 episodes of real-world actions as supervised examples for adapting the learned rearrangement policy to real-world input data. In this process, we make use of newly proposed strategies for improving the reinforcement learning process, such as heuristic exploration and the curation of a balanced set of experiences. We evaluate our method in both simulation and real-world settings using a Baxter robot to show that the proposed approach can effectively improve the training process in simulation, as well as efficiently adapt the learned policy to the real-world application, even when the camera pose differs from that in simulation. Additionally, we show that the learned system not only provides adaptive behavior to handle unforeseen events during execution, such as distracting objects, sudden changes in object positions, and obstacles, but also deals with obstacle shapes that were not present during training. (C) 2019 Elsevier B.V. All rights reserved.
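The abstract mentions curating a balanced set of experiences to improve the reinforcement learning process. The paper's exact mechanism is not reproduced in this record; as an illustrative sketch only (the class name, pool capacities, and the success/failure split are assumptions, not the authors' implementation), a balanced replay buffer that prevents rare successful episodes from being drowned out by failures might look like:

```python
import random
from collections import deque

class BalancedReplayBuffer:
    """Keep separate pools of successful and failed transitions and
    sample them in equal proportion for each training batch."""

    def __init__(self, capacity_per_pool=50_000):
        self.success = deque(maxlen=capacity_per_pool)
        self.failure = deque(maxlen=capacity_per_pool)

    def add(self, transition, reached_goal):
        # Route each transition to the pool matching its outcome.
        (self.success if reached_goal else self.failure).append(transition)

    def sample(self, batch_size):
        half = batch_size // 2
        pools = [p for p in (self.success, self.failure) if p]
        if len(pools) == 1:
            # Only one pool has data yet: sample from whatever is available.
            pool = list(pools[0])
            return random.sample(pool, min(batch_size, len(pool)))
        # Draw half the batch from each pool, then shuffle the order.
        batch = random.sample(list(self.success), min(half, len(self.success)))
        batch += random.sample(list(self.failure),
                               min(batch_size - len(batch), len(self.failure)))
        random.shuffle(batch)
        return batch
```

Sampling in equal proportion keeps the gradient signal from collapsing toward the far more frequent failure cases early in training, when successful pushes are still rare.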
Pages: 119-134
Page count: 16
Related Papers
50 results
  • [1] End-to-End Deep Reinforcement Learning for Exoskeleton Control
    Rose, Lowell
    Bazzocchi, Michael C. F.
    Nejat, Goldie
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 4294 - 4301
  • [2] NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning
    Haj-Ali, Ameer
    Ahmed, Nesreen K.
    Willke, Ted
    Shao, Yakun Sophia
    Asanovic, Krste
    Stoica, Ion
    [J]. CGO'20: PROCEEDINGS OF THE 18TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2020, : 242 - 255
  • [3] End-to-End Race Driving with Deep Reinforcement Learning
    Jaritz, Maximilian
    de Charette, Raoul
    Toromanoff, Marin
    Perot, Etienne
    Nashashibi, Fawzi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 2070 - 2075
  • [4] End-to-end Control of Kart Agent with Deep Reinforcement Learning
    Zhang Ruiming
    Liu Chengju
    Chen Qijun
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO), 2018, : 1688 - 1693
  • [5] End-to-end Deep Reinforcement Learning Based Coreference Resolution
    Fei, Hongliang
    Li, Xu
    Li, Dingcheng
    Li, Ping
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 660 - 665
  • [6] Rearrangement with Nonprehensile Manipulation Using Deep Reinforcement Learning
    Yuan, Weihao
    Stork, Johannes A.
    Kragic, Danica
    Wang, Michael Y.
    Hang, Kaiyu
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 270 - 277
  • [7] End-to-End Autonomous Driving Decision Based on Deep Reinforcement Learning
    Huang, Zhiqing
    Zhang, Ji
    Tian, Rui
    Zhang, Yanxin
    [J]. CONFERENCE PROCEEDINGS OF 2019 5TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS (ICCAR), 2019, : 658 - 662
  • [8] Optimization of Neuroprosthetic Vision via End-to-End Deep Reinforcement Learning
    Kucukoglu, Burcu
    Rueckauer, Bodo
    Ahmad, Nasir
    van Steveninck, Jaap de Ruyter
    Guclu, Umut
    van Gerven, Marcel
    [J]. INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2022, 32 (11)
  • [9] End-to-End Autonomous Exploration with Deep Reinforcement Learning and Intrinsic Motivation
    Ruan, Xiaogang
    Li, Peng
    Zhu, Xiaoqing
    Yu, Hejie
    Yu, Naigong
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021