Task-Oriented Self-Imitation Learning for Robotic Autonomous Skill Acquisition

Authors
Ran, Chenyang [1]
Su, Jianbo [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Self-imitation learning; self-evolution; episodic score; guide reward; autonomous skill acquisition
DOI
10.1142/S0219843624500014
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
The inferior sample efficiency of reinforcement learning (RL) and the requirement for high-quality demonstrations in imitation learning (IL) hinder their application to real-world robots. To address these challenges, a novel self-evolution framework, named task-oriented self-imitation learning (TOSIL), is proposed. To circumvent external demonstrations, the top-K self-generated trajectories are chosen as expert data from both per-episode exploration and long-term return perspectives. Each transition is assigned a guide reward, which is formulated from these trajectories. The guide rewards are updated as the agent evolves, encouraging good exploration behaviors. This ensures that the agent explores in directions relevant to the task, improving sample efficiency and asymptotic performance. The experimental results on locomotion and manipulation tasks indicate that the proposed framework outperforms other state-of-the-art RL methods. Furthermore, the integration of suboptimal trajectories has the potential to improve sample efficiency while maintaining performance. This is a significant advance in autonomous skill acquisition for robots.
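The top-K selection and guide reward described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the scoring rule nor the guide-reward formula, so the single scalar `score` (standing in for the paper's combined per-episode and long-term criteria), the names `Trajectory`, `TopKBuffer`, and `guide_reward`, and the distance-based reward `exp(-d)` are all assumptions made for illustration.

```python
import heapq
import math
from dataclasses import dataclass, field
from typing import List, Tuple

State = Tuple[float, ...]


@dataclass(order=True)
class Trajectory:
    # Assumption: one scalar score ranks trajectories; the paper combines
    # per-episode and long-term return criteria whose form is not given here.
    score: float
    states: List[State] = field(compare=False)


class TopKBuffer:
    """Keeps the K highest-scoring self-generated trajectories."""

    def __init__(self, k: int) -> None:
        self.k = k
        self._heap: List[Trajectory] = []  # min-heap: worst kept item at index 0

    def add(self, traj: Trajectory) -> None:
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, traj)
        elif traj.score > self._heap[0].score:
            # Replace the current worst trajectory with the better newcomer.
            heapq.heapreplace(self._heap, traj)

    def best(self) -> List[Trajectory]:
        return sorted(self._heap, reverse=True)


def guide_reward(state: State, buffer: TopKBuffer) -> float:
    """Hypothetical guide reward: exp(-d), where d is the Euclidean distance
    from `state` to the nearest state in any stored trajectory."""
    expert_states = [s for t in buffer.best() for s in t.states]
    if not expert_states:
        return 0.0  # no self-generated "expert" data yet
    d = min(math.dist(state, s) for s in expert_states)
    return math.exp(-d)


buf = TopKBuffer(k=2)
buf.add(Trajectory(1.0, [(0.0, 0.0)]))
buf.add(Trajectory(3.0, [(1.0, 1.0)]))
buf.add(Trajectory(2.0, [(2.0, 2.0)]))  # evicts the trajectory scored 1.0
print([t.score for t in buf.best()])
print(guide_reward((1.0, 1.0), buf))
```

Because the buffer is refreshed whenever a better trajectory arrives, the reward computed from it changes as the agent improves, which is the behavior the abstract attributes to guide rewards.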
Pages: 22