Hierarchical reinforcement learning with adaptive scheduling for robot control

Cited by: 4

Authors:

Huang, Zhigang [1]

Liu, Quan [1]

Zhu, Fei [1]

Affiliation:

[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Vol. 126

Funding:

National Natural Science Foundation of China

Keywords:

Hierarchical reinforcement learning; Exploration and exploitation; Scheduling; Sparse reward

DOI:

10.1016/j.engappai.2023.107130

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Conventional hierarchical reinforcement learning (HRL) relies on discrete options to represent explicitly distinguishable knowledge, which may lead to severe performance bottlenecks. Continuous options can represent richer knowledge, but reliable methods for scheduling them are lacking. This paper proposes the hierarchical reinforcement learning with adaptive scheduling (HAS) algorithm as a usable scheduling method for continuous options. Its low-level controller learns diverse options, while its high-level controller schedules those options to learn solutions. HAS adaptively balances exploration and exploitation during the frequent scheduling of continuous options, maximizing their representation potential. It builds on multi-step static scheduling and makes switching decisions according to the relative advantages of the previous and the estimated continuous options, enabling the agent to focus on different behaviors at different phases of the task. The expected t-step distance is used to demonstrate the superiority of adaptive scheduling in terms of exploration. Furthermore, an annealing-based interruption incentive is proposed to alleviate excessive exploration during the early training phase, accelerating convergence. Finally, HAS is applied to robot control with sparse rewards in continuous spaces, and a comprehensive experimental analysis scheme is developed. The experimental results not only demonstrate the high performance and robustness of HAS but also provide evidence that the adaptive scheduling method has a positive effect on both the representation and the option policies.
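
The switching decision described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives no formulas, so the function names, the linear annealing schedule, the `initial` bonus value, and the use of scalar value estimates `q_prev` and `q_new` are all assumptions made for illustration. The sketch only shows the general idea of interrupting a held option when a newly estimated option looks relatively better, with a bonus that decays over training.

```python
def annealed_incentive(step, total_steps, initial=0.5):
    """Hypothetical interruption bonus that decays linearly from `initial` to 0.

    The linear schedule and the default value are assumptions; the abstract
    only states that the incentive is based on annealing.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return initial * (1.0 - frac)


def should_switch(q_prev, q_new, elapsed, max_hold, incentive=0.0):
    """Decide whether the high-level controller replaces the held option.

    q_prev:    estimated value of continuing the previously chosen option (assumed scalar)
    q_new:     estimated value of the newly proposed option (assumed scalar)
    elapsed:   number of steps the previous option has been held
    max_hold:  multi-step static scheduling horizon; a switch is forced when reached
    incentive: optional annealed bonus added in favor of interrupting
    """
    if elapsed >= max_hold:
        # Static multi-step scheduling baseline: always reschedule at the horizon.
        return True
    # Adaptive part: interrupt early when the relative advantage is positive.
    return (q_new - q_prev) + incentive > 0.0
```

Under these assumptions, a slightly worse candidate option can still trigger an interruption early in training, when the bonus is large, but not late in training, once the bonus has annealed to zero.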

Pages: 15
