Task-Oriented Deep Reinforcement Learning for Robotic Skill Acquisition and Control

Cited by: 30
Authors
Xiang, Guofei [1]
Su, Jianbo [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Key Lab Syst Control & Informat Proc, Minist Educ, Shanghai 200240, Peoples R China
Keywords
Continuous control; deep neural networks (DNNs); exploration; imitation learning (IL); reinforcement learning (RL); robotics; skill acquisition; SEARCH;
DOI
10.1109/TCYB.2019.2949596
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Reinforcement learning (RL) and imitation learning (IL), especially when equipped with deep neural networks, have been widely studied for autonomous robotic skill acquisition and control tasks. However, these methods and their extensions require extensive environmental interactions during training, which severely limits their application to real-world robots. To alleviate this problem, we present an efficient model-free off-policy actor-critic algorithm for robotic skill acquisition and continuous control that fuses the task reward with a task-oriented guiding reward, which is formulated by leveraging few and imperfect expert demonstrations. In this framework, the agent explores the environment more intentionally, which improves sampling efficiency; it also exploits its experience more effectively, which substantially improves performance at the same time. The empirical results on robotic locomotion tasks show that the proposed scheme can lower sample complexity by 2-10 times compared with state-of-the-art baseline deep RL (DRL) algorithms, while achieving performance better than that of the expert. Furthermore, the proposed algorithm achieves significant improvement in both sampling efficiency and asymptotic performance on tasks with sparse and delayed reward, on which the baseline DRL algorithms struggle to make progress. This is a substantial step toward applying these methods to autonomous skill acquisition on real robots.
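The abstract does not state how the task reward and the guiding reward are combined, so the following is only a minimal sketch of the general idea of reward fusion, not the authors' formulation. It assumes a weighted sum, and it assumes a guiding term that decays with distance to the nearest demonstrated state. The names `guiding_reward`, `fused_reward`, `scale`, and `weight` are hypothetical and introduced here for illustration.

```python
import math


def guiding_reward(state, demo_states, scale=1.0):
    """Hypothetical guiding term: close to 1 near a demonstrated state, near 0 far away.

    ASSUMPTION: the exponential-of-distance form is illustrative only;
    the abstract does not specify how the guiding reward is computed.
    """
    nearest = min(math.dist(state, demo) for demo in demo_states)
    return math.exp(-scale * nearest)


def fused_reward(task_reward, state, demo_states, weight=0.5):
    """Combine the environment's task reward with the guiding term.

    ASSUMPTION: a simple weighted sum stands in for the paper's fusion rule.
    """
    return task_reward + weight * guiding_reward(state, demo_states)


# A few (possibly imperfect) demonstrated states along a straight path.
demos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

# With a sparse task reward of zero, the guiding term still separates
# states near the demonstrations from states far from them.
on_path = fused_reward(0.0, (1.0, 0.0), demos)
off_path = fused_reward(0.0, (1.0, 3.0), demos)
```

In this sketch a state on the demonstrated path receives a larger fused reward than one far from it even when the task reward is zero, which is the kind of signal that could steer exploration on sparse-reward tasks.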
Pages: 1056-1069
Page count: 14
Related papers
50 in total
  • [1] Task-Oriented Self-Imitation Learning for Robotic Autonomous Skill Acquisition
    Ran, Chenyang
    Su, Jianbo
    [J]. INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2024
  • [2] Manipulation Skill Acquisition for Robotic Assembly using Deep Reinforcement Learning
    Li, Fengming
    Jiang, Qi
    Quan, Wei
    Song, Rui
    Li, Yibin
    [J]. 2019 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2019, : 13 - 18
  • [3] A Task-oriented Chatbot Based on LSTM and Reinforcement Learning
    Hsueh, Yu-Ling
    Chou, Tai-Liang
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (01)
  • [4] A Task-oriented Chatbot Based on LSTM and Reinforcement Learning
    Chou, Tai-Liang
    Hsueh, Yu-Ling
    [J]. NLPIR 2019: 2019 3RD INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, 2019, : 87 - 91
  • [5] Task-oriented Dialogue System Based on Reinforcement Learning
    Song, Meina
    Chen, Zhongfu
    Niu, Peiqing
    Haihong, E.
    [J]. PROCEEDINGS OF 2019 IEEE 10TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2019), 2019, : 93 - 98
  • [6] Deep Reinforcement Learning Based Task-Oriented Communication in Multi-Agent Systems
    He, Guojun
    Feng, Mingjie
    Zhang, Yu
    Liu, Guanghua
    Dai, Yueyue
    Jiang, Tao
    [J]. IEEE WIRELESS COMMUNICATIONS, 2023, 30 (03) : 112 - 119
  • [7] Bridging Skill and Task-Oriented Reading
    Higgs, Karyn
    Magliano, Joseph P.
    Vidal-Abarca, Eduardo
    Martinez, Tomas
    McNamara, Danielle S.
    [J]. DISCOURSE PROCESSES, 2017, 54 (01) : 19 - 39
  • [8] Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems
    Li, Ziming
    Kiseleva, Julia
    de Rijke, Maarten
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020
  • [9] A Survey of Task-Oriented Dialogue Policies Based on Reinforcement Learning
    Xu, Kai
    Wang, Zhen-Yu
    Wang, Xu
    Qin, Hua
    Long, Yu-Xuan
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (06) : 1201 - 1231
  • [10] Task-oriented reinforcement learning for continuous tasks in dynamic environment
    Kamal, MAS
    Murata, J
    Hirasawa, K
    [J]. SICE 2002: PROCEEDINGS OF THE 41ST SICE ANNUAL CONFERENCE, VOLS 1-5, 2002, : 829 - 832