Hierarchical Reinforcement Learning With Universal Policies for Multistep Robotic Manipulation

Times Cited: 0
Authors
Yang, Xintong [1 ]
Ji, Ze [1 ]
Wu, Jing [2 ]
Lai, Yu-Kun [2 ]
Wei, Changyun [3 ]
Liu, Guoliang [4 ]
Setchi, Rossitza [1 ]
Affiliations
[1] Cardiff Univ, Sch Engn, Ctr Artificial Intelligence Robot & Human Machine, Cardiff CF24 3AA, Wales
[2] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF24 3AA, Wales
[3] Hohai Univ, Dept Robot Engn, Changzhou 213022, Peoples R China
[4] Shandong Univ, Sch Control Sci & Engn, Jinan 250300, Peoples R China
Keywords
Task analysis; Planning; Robots; Standards; Training; Reinforcement learning; Stacking; Hierarchical reinforcement learning (HRL); multistep tasks; option framework (OF); planning and control; robotic manipulation; universal policy
DOI
10.1109/TNNLS.2021.3059912
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multistep tasks, such as block stacking or parts (dis)assembly, are complex for autonomous robotic manipulation. A robotic system for such tasks would need to hierarchically combine motion control at a lower level and symbolic planning at a higher level. Recently, reinforcement learning (RL)-based methods have been shown to handle robotic motion control with better flexibility and generalizability. However, these methods have limited capability to handle such complex tasks involving planning and control with many intermediate steps over a long time horizon. First, current RL systems cannot achieve varied outcomes by planning over intermediate steps (e.g., stacking blocks in different orders). Second, the exploration efficiency of learning multistep tasks is low, especially when rewards are sparse. To address these limitations, we develop a unified hierarchical reinforcement learning framework, named Universal Option Framework (UOF), to enable the agent to learn varied outcomes in multistep tasks. To improve learning efficiency, we train both symbolic planning and kinematic control policies in parallel, aided by two proposed techniques: 1) an auto-adjusting exploration strategy (AAES) at the low level to stabilize the parallel training, and 2) abstract demonstrations at the high level to accelerate convergence. To evaluate its performance, we performed experiments on various multistep block-stacking tasks with blocks of different shapes and combinations and with different degrees of freedom for robot control. The results demonstrate that our method can accomplish multistep manipulation tasks more efficiently and stably, and with significantly less memory consumption.
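The "universal policy" the abstract refers to is a policy conditioned on a goal as well as the current state, so one network (or table) serves every desired outcome. As a purely illustrative sketch, not the authors' implementation, the snippet below trains a goal-conditioned tabular Q-function on an invented 1-D toy task with the kind of sparse reward the abstract describes (0 on reaching the goal, -1 otherwise); the grid size, exploration rate, and all hyperparameters are assumptions chosen for brevity.

```python
import numpy as np

# Toy universal (goal-conditioned) policy: tabular Q-learning on a 1-D grid.
# The goal position changes every episode, and the reward is sparse:
# 0 on success, -1 on every other step. One Q-table covers all goals.
rng = np.random.default_rng(0)
N = 7                      # grid positions 0..6
ACTIONS = (-1, +1)         # move left / move right
Q = np.zeros((N, N, 2))    # Q[state, goal, action]

def step(s, a):
    """Apply an action, clipping to the grid boundaries."""
    return min(max(s + ACTIONS[a], 0), N - 1)

for episode in range(3000):
    s, g = int(rng.integers(N)), int(rng.integers(N))
    for _ in range(2 * N):
        # Epsilon-greedy action selection, conditioned on the goal g.
        a = int(rng.integers(2)) if rng.random() < 0.2 else int(np.argmax(Q[s, g]))
        s2 = step(s, a)
        r = 0.0 if s2 == g else -1.0                      # sparse reward
        target = r + (0.0 if s2 == g else 0.9 * Q[s2, g].max())
        Q[s, g, a] += 0.5 * (target - Q[s, g, a])
        s = s2
        if s == g:
            break

def act(s, g):
    """Greedy universal policy: action depends on both state and goal."""
    return int(np.argmax(Q[s, g]))

# The same table serves every goal: from state 3 the greedy action moves
# right toward goal 6 but left toward goal 0.
print(act(3, 6), act(3, 0))
```

The paper's framework layers a symbolic high-level policy over such goal-conditioned low-level control; this sketch only shows the low-level goal-conditioning mechanism, not the hierarchical option structure, the AAES exploration schedule, or the abstract demonstrations.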
Pages: 4727-4741
Page count: 15