Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning

Cited by: 4
Authors
Jin, Jun [1, 2]
Graves, Daniel [1]
Haigh, Cameron [1]
Luo, Jun [1]
Jagersand, Martin [2]
Affiliations
[1] Huawei Technol Canada Ltd, Noahs Ark Lab, Edmonton, AB, Canada
[2] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Keywords
CORTEX
DOI
10.1109/ICRA46639.2022.9811963
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider real-world reinforcement learning (RL) of robotic manipulation tasks that involve both visuomotor skills and contact-rich skills. We aim to train a policy that maps multimodal sensory observations (vision and force) to a manipulator's joint velocities under practical considerations. We propose to use offline samples to learn a set of general value functions (GVFs) that make counterfactual predictions from the visual inputs. We show that combining the offline-learned counterfactual predictions with force feedback in online policy learning allows efficient reinforcement learning given only a terminal (success/failure) reward. We argue that the learned counterfactual predictions form a compact and informative representation that enables sample efficiency and provides auxiliary reward signals that guide online exploration towards contact-rich states. Various experiments in simulation and real-world settings were performed for evaluation. Recordings of the real-world robot training can be found via https://sites.google.com/view/realrl.
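As a rough illustration of what learning a general value function from offline samples can look like, the sketch below fits one counterfactual prediction on a toy problem. It is not the authors' implementation: the five-state chain, the cumulant, both policies, and every name in the code are hypothetical, and importance-weighted tabular TD(0) stands in for whatever learner the paper applies to visual inputs.

```python
# Toy chain with states 0..4; entering state 4 stands in for "contact reached".
# Everything here is a hypothetical illustration, not the paper's setup.
N_STATES = 5
TERMINAL = 4
ACTIONS = (0, 1)  # 0 = step left, 1 = step right
GAMMA = 0.9


def step(state, action):
    """Deterministic toy dynamics: move left or right, clipped at state 0."""
    return max(state - 1, 0) if action == 0 else state + 1


def cumulant(next_state):
    """Signal the GVF predicts: 1 when the terminal state is entered."""
    return 1.0 if next_state == TERMINAL else 0.0


def target_policy(state, action):
    """Policy the counterfactual question asks about: always step right."""
    return 1.0 if action == 1 else 0.0


def behavior_policy(state, action):
    """Policy assumed to have logged the offline data: uniform random."""
    return 0.5


def learn_gvf(dataset, sweeps=500, alpha=0.1):
    """Off-policy TD(0) over a fixed batch of (state, action, next_state)."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s, a, s_next in dataset:
            # Importance ratio reweights logged actions toward the target policy.
            rho = target_policy(s, a) / behavior_policy(s, a)
            # The prediction stops accumulating once the terminal state is hit.
            continuation = 0.0 if s_next == TERMINAL else GAMMA
            td_error = cumulant(s_next) + continuation * values[s_next] - values[s]
            values[s] += alpha * rho * td_error
    return values


# Assumed offline batch: one logged transition per non-terminal state-action pair.
dataset = [(s, a, step(s, a)) for s in range(TERMINAL) for a in ACTIONS]
predictions = learn_gvf(dataset)
```

Under these assumptions, `predictions[s]` approaches the discounted value of reaching the terminal state if the agent were to always step right from state `s`, even though the logged data came from a random policy. For state 0 that value is 0.9^3 = 0.729.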
Pages: 3616-3623
Page count: 8