Hierarchical Reinforcement Learning With Universal Policies for Multistep Robotic Manipulation

Times Cited: 0
Authors
Yang, Xintong [1 ]
Ji, Ze [1 ]
Wu, Jing [2 ]
Lai, Yu-Kun [2 ]
Wei, Changyun [3 ]
Liu, Guoliang [4 ]
Setchi, Rossitza [1 ]
Affiliations
[1] Cardiff Univ, Sch Engn, Ctr Artificial Intelligence Robot & Human Machine, Cardiff CF24 3AA, Wales
[2] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF24 3AA, Wales
[3] Hohai Univ, Dept Robot Engn, Changzhou 213022, Peoples R China
[4] Shandong Univ, Sch Control Sci & Engn, Jinan 250300, Peoples R China
Keywords
Task analysis; Planning; Robots; Standards; Training; Reinforcement learning; Stacking; Hierarchical reinforcement learning (HRL); multistep tasks; option framework (OF); planning and control; robotic manipulation; universal policy
DOI
10.1109/TNNLS.2021.3059912
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multistep tasks, such as block stacking or parts (dis)assembly, are complex for autonomous robotic manipulation. A robotic system for such tasks would need to hierarchically combine motion control at a lower level and symbolic planning at a higher level. Recently, reinforcement learning (RL)-based methods have been shown to handle robotic motion control with better flexibility and generalizability. However, these methods have limited capability to handle such complex tasks involving planning and control with many intermediate steps over a long time horizon. First, current RL systems cannot achieve varied outcomes by planning over intermediate steps (e.g., stacking blocks in different orders). Second, the exploration efficiency of learning multistep tasks is low, especially when rewards are sparse. To address these limitations, we develop a unified hierarchical reinforcement learning framework, named Universal Option Framework (UOF), to enable the agent to learn varied outcomes in multistep tasks. To improve learning efficiency, we train both symbolic planning and kinematic control policies in parallel, aided by two proposed techniques: 1) an auto-adjusting exploration strategy (AAES) at the low level to stabilize the parallel training, and 2) abstract demonstrations at the high level to accelerate convergence. To evaluate its performance, we performed experiments on various multistep block-stacking tasks with blocks of different shapes and combinations and with different degrees of freedom for robot control. The results demonstrate that our method can accomplish multistep manipulation tasks more efficiently and stably, and with significantly less memory consumption.
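The "universal policy" the abstract refers to is a policy conditioned on a goal as well as the current state, so one network (or table) serves every desired outcome. As a purely illustrative sketch, not the authors' implementation, the snippet below trains a goal-conditioned tabular Q-function on an invented 1-D toy task with the kind of sparse reward the abstract describes (0 on reaching the goal, -1 otherwise); the grid size, exploration rate, and all hyperparameters are assumptions chosen for brevity.

```python
import numpy as np

# Toy universal (goal-conditioned) policy: tabular Q-learning on a 1-D grid.
# The goal position changes every episode, and the reward is sparse:
# 0 on success, -1 on every other step. One Q-table covers all goals.
rng = np.random.default_rng(0)
N = 7                      # grid positions 0..6
ACTIONS = (-1, +1)         # move left / move right
Q = np.zeros((N, N, 2))    # Q[state, goal, action]

def step(s, a):
    """Apply an action, clipping to the grid boundaries."""
    return min(max(s + ACTIONS[a], 0), N - 1)

for episode in range(3000):
    s, g = int(rng.integers(N)), int(rng.integers(N))
    for _ in range(2 * N):
        # Epsilon-greedy action selection, conditioned on the goal g.
        a = int(rng.integers(2)) if rng.random() < 0.2 else int(np.argmax(Q[s, g]))
        s2 = step(s, a)
        r = 0.0 if s2 == g else -1.0                      # sparse reward
        target = r + (0.0 if s2 == g else 0.9 * Q[s2, g].max())
        Q[s, g, a] += 0.5 * (target - Q[s, g, a])
        s = s2
        if s == g:
            break

def act(s, g):
    """Greedy universal policy: action depends on both state and goal."""
    return int(np.argmax(Q[s, g]))

# The same table serves every goal: from state 3 the greedy action moves
# right toward goal 6 but left toward goal 0.
print(act(3, 6), act(3, 0))
```

The paper's framework layers a symbolic high-level policy over such goal-conditioned low-level control; this sketch only shows the low-level goal-conditioning mechanism, not the hierarchical option structure, the AAES exploration schedule, or the abstract demonstrations.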
Pages: 4727-4741
Page count: 15