Efficient hindsight reinforcement learning using demonstrations for robotic tasks with sparse rewards

Times Cited: 8
Authors
Zuo, Guoyu [1 ,2 ]
Zhao, Qishen [1 ,2 ]
Lu, Jiahao [1 ,2 ]
Li, Jiangeng [1 ,2 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[2] Beijing Key Lab of Computational Intelligence & Intelligent System, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; US National Science Foundation;
Keywords
Robot learning; reinforcement learning; sparse reward; CAHER; demonstrations;
DOI
10.1177/1729881419898342
Chinese Library Classification (CLC)
TP24 [Robotics];
Discipline Codes
080202; 1405;
Abstract
The goal of reinforcement learning is to enable an agent to learn from rewards. However, some robotic tasks are naturally specified with sparse rewards, and manually shaping reward functions is difficult. In this article, we propose a general, model-free reinforcement learning approach for robotic tasks with sparse rewards. First, a variant of Hindsight Experience Replay, Curious and Aggressive Hindsight Experience Replay (CAHER), is proposed to improve the sample efficiency of reinforcement learning methods and avoid the need for complicated reward engineering. Second, building on the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm, demonstrations are leveraged to overcome the exploration problem and speed up policy training. Finally, an action loss is added to the loss function to minimize the vibration of the output action while maximizing the action's value. Experiments on simulated robotic tasks are performed with different hyperparameters to verify the effectiveness of our method. The results show that our method effectively solves the sparse reward problem and achieves a high learning speed.
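The abstract names three ingredients (hindsight relabeling, demonstrations on top of TD3, and an action loss) but gives no implementation details. The Python sketch below is therefore only an illustration under assumptions: it uses the standard "future" relabeling strategy of plain HER as a stand-in for CAHER's unstated goal-selection rule, assumes gym-robotics-style dict observations with an "achieved_goal" key, and its helper names (her_relabel, reward_fn, actor_loss) and weights (lambda_bc, lambda_act) are hypothetical rather than taken from the paper.

import random
import torch.nn.functional as F

def her_relabel(episode, reward_fn):
    # Hindsight relabeling ("future" strategy): replace each transition's
    # goal with an achieved goal from a later step of the same episode, so
    # even a failed trajectory yields transitions whose sparse reward is
    # informative. CAHER's curious/aggressive goal selection is not
    # specified in the abstract; this is the plain-HER stand-in.
    relabeled = []
    for t, (obs, act, _, next_obs, _) in enumerate(episode):
        future = episode[random.randrange(t, len(episode))]
        new_goal = future[3]["achieved_goal"]           # a goal actually reached
        new_rew = reward_fn(next_obs["achieved_goal"], new_goal)
        relabeled.append((obs, act, new_rew, next_obs, new_goal))
    return relabeled

def actor_loss(actor, critic, obs, demo_obs, demo_act,
               lambda_bc=1.0, lambda_act=0.05):
    # TD3-style actor objective plus the two extra terms the abstract
    # describes: a behavior-cloning term on demonstration data (to ease
    # exploration) and an action-magnitude penalty (the "action loss")
    # that damps vibration of the output action while still maximizing
    # the critic's value estimate. The weights are illustrative only.
    pi = actor(obs)
    q_term = -critic(obs, pi).mean()                    # maximize Q(s, pi(s))
    bc_term = F.mse_loss(actor(demo_obs), demo_act)     # imitate demonstrations
    act_term = pi.pow(2).mean()                         # penalize large actions
    return q_term + lambda_bc * bc_term + lambda_act * act_term

In a full TD3 training loop, such an actor loss would be applied on the usual delayed schedule (one policy update per several critic updates), with the relabeled transitions mixed into the replay buffer alongside the originals.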
Pages: 13
Related Papers (50 in total)
  • [1] Relay Hindsight Experience Replay: Self-guided continual reinforcement learning for sequential object manipulation tasks with sparse rewards
    Luo, Yongle
    Wang, Yuxin
    Dong, Kun
    Zhang, Qiang
    Cheng, Erkang
    Sun, Zhiyong
    Song, Bo
    [J]. NEUROCOMPUTING, 2023, 557
  • [2] Shaping Rewards for Reinforcement Learning with Imperfect Demonstrations using Generative Models
    Wu, Yuchen
    Mozifian, Melissa
    Shkurti, Florian
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 6628 - 6634
  • [3] Multimodal fusion for autonomous navigation via deep reinforcement learning with sparse rewards and hindsight experience replay
    Xiao, Wendong
    Yuan, Liang
    Ran, Teng
    He, Li
    Zhang, Jianbo
    Cui, Jianping
    [J]. DISPLAYS, 2023, 78
  • [4] Intermittent Reinforcement Learning with Sparse Rewards
    Sahoo, Prachi Pratyusha
    Vamvoudakis, Kyriakos G.
    [J]. 2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 2709 - 2714
  • [5] Efficient Policy Learning for General Robotic Tasks with Adaptive Dual-memory Hindsight Experience Replay Based on Deep Reinforcement Learning
    Dong, Menghua
    Ying, Fengkang
    Li, Xiangjian
    Liu, Huashan
    [J]. 2023 7TH INTERNATIONAL CONFERENCE ON ROBOTICS, CONTROL AND AUTOMATION, ICRCA, 2023, : 62 - 66
  • [6] Reinforcement learning for robotic manipulation using simulated locomotion demonstrations
    Kilinc, Ozsel
    Montana, Giovanni
    [J]. MACHINE LEARNING, 2022, 111 (02) : 465 - 486
  • [7] Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments
    Rengarajan, Desik
    Chaudhary, Sapana
    Kim, Jaewon
    Kalathil, Dileep
    Shakkottai, Srinivas
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [8] Deep Reinforcement Learning for an Anthropomorphic Robotic Arm Under Sparse Reward Tasks
    Cheng, Hao
    Duan, Feng
    Zheng, Haosi
    [J]. INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2021, PT II, 2021, 13014 : 79 - 89
  • [9] Hierarchical multi-agent reinforcement learning for cooperative tasks with sparse rewards in continuous domain
    Cao, Jingyu
    Dong, Lu
    Yuan, Xin
    Wang, Yuanda
    Sun, Changyin
    [J]. Neural Computing and Applications, 2024, 36 : 273 - 287