Prediction of Reward Functions for Deep Reinforcement Learning via Gaussian Process Regression

Cited by: 19
Authors
Lim, Jaehyun [1 ]
Ha, Seungchul [1 ]
Choi, Jongeun [1 ]
Affiliation
[1] Yonsei Univ, Sch Mech Engn, Seoul 03722, South Korea
Funding
National Research Foundation of Singapore
Keywords
Gaussian processes; inverse reinforcement learning; mobile robots; MOBILE; SELECTION;
DOI
10.1109/TMECH.2020.2993564
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Inverse reinforcement learning (IRL) is a technique for automatic reward acquisition; however, it is difficult to apply to high-dimensional problems with unknown dynamics. This article proposes an efficient way to solve the IRL problem based on sparse Gaussian process (GP) prediction with ℓ1-regularization, using only a highly limited number of expert demonstrations. A GP model is trained to predict a reward function from trajectory-reward pair data generated by deep reinforcement learning with different reward functions. The trained GP successfully predicts the reward functions of human experts from their collected demonstration trajectories. To demonstrate the approach, it is applied to obstacle-avoidance navigation of a mobile robot. The experimental results clearly show that the robots can clone the experts' optimality in navigation trajectories while avoiding obstacles, using only a very small number of expert demonstrations (e.g., <= 6). The proposed approach therefore shows great potential for complex real-world applications in an expert-data-efficient manner.
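The core regression step the abstract describes, predicting a reward quantity from trajectory features via a Gaussian process, can be illustrated with a minimal numpy sketch of standard (dense) GP regression. This omits the paper's sparse approximation and ℓ1-regularization; the feature vectors, kernel hyperparameters, and toy data below are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential (RBF) kernel between the rows of A and B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return signal_var * np.exp(-0.5 * sq / length_scale**2)

def gp_predict(X_train, y_train, X_test, noise_var=1e-2):
    """GP regression posterior mean and variance at X_test.

    X_train : (n, d) trajectory feature vectors with known reward values y_train.
    X_test  : (m, d) feature vectors of new (e.g., expert) trajectories.
    """
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    # Cholesky-based solve for numerical stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_test, X_test) - v.T @ v)
    return mean, var

# Toy usage: 1-D "trajectory features" paired with scalar reward values.
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 0.5])
mean, var = gp_predict(X, y, np.array([[1.5]]))
```

The posterior variance is what makes a GP attractive here: it quantifies how confident the reward prediction is for a new expert trajectory, which matters when only a handful of demonstrations (<= 6 in the paper's experiments) are available.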
Pages: 1739-1746
Page count: 8
Related Papers (50 total)
  • [21] Automated Assessment of Bone Age Using Deep Learning and Gaussian Process Regression
    Van Steenkiste, Tom
    Ruyssinck, Joeri
    Janssens, Olivier
    Vandersmissen, Baptist
    Vandecasteele, Florian
    Devolder, Pieter
    Achten, Eric
    Van Hoecke, Sofie
    Deschrijver, Dirk
    Dhaene, Tom
    [J]. 2018 40TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2018, : 674 - 677
  • [22] Inverse Reinforcement Learning with Locally Consistent Reward Functions
    Quoc Phong Nguyen
    Low, Kian Hsiang
    Jaillet, Patrick
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [23] Nonstationary covariance functions for Gaussian process regression
    Paciorek, CJ
    Schervish, MJ
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 273 - 280
  • [24] Deep reinforcement learning-based rehabilitation robot trajectory planning with optimized reward functions
    Wang, Xusheng
    Xie, Jiexin
    Guo, Shijie
    Li, Yue
    Sun, Pengfei
    Gan, Zhongxue
    [J]. ADVANCES IN MECHANICAL ENGINEERING, 2021, 13 (12)
  • [25] Accurate Prediction of Network Distance via Federated Deep Reinforcement Learning
    Huang, Haojun
    Cai, Yiming
    Min, Geyong
    Wang, Haozhe
    Liu, Gaoyang
    Wu, Dapeng Oliver
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (04) : 3301 - 3314
  • [26] Accurate Prediction of Required Virtual Resources via Deep Reinforcement Learning
    Huang, Haojun
    Li, Zhaoxi
    Tian, Jialin
    Min, Geyong
    Miao, Wang
    Wu, Dapeng Oliver
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (02) : 920 - 933
  • [27] Prediction of Boiler Combustion Energy Efficiency via Deep Reinforcement Learning
    Jiang, Hui
    Cai, Ziyun
    Zhang, Tengfei
    Peng, Chen
    [J]. 2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 2658 - 2662
  • [28] Variance aware reward smoothing for deep reinforcement learning
    Dong, Yunlong
    Zhang, Shengjun
    Liu, Xing
    Zhang, Yu
    Shen, Tan
    [J]. NEUROCOMPUTING, 2021, 458 : 327 - 335
  • [29] Reward Space Noise for Exploration in Deep Reinforcement Learning
    Sun, Chuxiong
    Wang, Rui
    Li, Qian
    Hu, Xiaohui
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (10)
  • [30] Deep reinforcement learning with reward design for quantum control
    Yu, Haixu
    Zhao, Xudong
    [J]. IEEE Transactions on Artificial Intelligence, 2024, 5 (03): : 1087 - 1101