Gaussian process model based reinforcement learning

Cited by: 1
Authors
Yoo, Jae Hyun [1]
Affiliations
[1] Hankyong National University, Republic of Korea
Keywords
Computationally efficient; Gaussian process regression; Learning control; Linear feedback controllers; Quadratic cost functions; Quadrotor unmanned aerial vehicles; Reinforcement learning control; UAV (unmanned aerial vehicle)
DOI
10.5302/J.ICROS.2019.18.0221
Abstract
Reinforcement learning (RL) has been a promising approach in robotics and control because data-driven learning methods can reduce a system's reliance on human engineering knowledge. Model-based RL autonomously learns the observed dynamics using a general, flexible nonparametric approach. Probabilistic Inference for Learning Control (PILCO) is one of the most data-efficient model-based RL frameworks. Because PILCO formulates a Bayesian estimation problem with Gaussian process regression, it derives a fully deterministic approximate inference for policy evaluation, which makes it computationally efficient. However, PILCO requires a task-specific scenario: if an agent is given a new goal different from the original training goal, PILCO must relearn its model from scratch. This paper extends PILCO to tune a linear feedback controller with a quadratic cost function, where the quadratic cost function commonly used in control systems can adjust the trade-off between control input consumption and convergence rate. The suggested method not only preserves the analytic, deterministic approximate inference for policy evaluation, but also makes the controller design interpretable. The suggested RL framework is applied to the control of a small quadrotor unmanned aerial vehicle (UAV) with no given dynamics. The simulation results show the convergence of the learning control performance as a function of the number of RL iterations. © ICROS 2019.
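
To make the setup in the abstract concrete, the following is a minimal Python sketch of its three ingredients: a Gaussian process model of unknown one-step dynamics, a linear feedback policy u = -Kx, and a quadratic cost x'Qx + u'Ru trading off input consumption against convergence rate. It is an illustration under stated assumptions, not the paper's code: the toy 2-D system, the names (step, gp_predict, expected_cost), and the weights Q, R are all invented here, and PILCO itself propagates full Gaussian state distributions by moment matching and optimizes the policy with analytic gradients, for which the mean-only rollout and random gain search below are crude stand-ins.

# Hedged sketch: GP dynamics model + linear feedback policy + quadratic cost.
# All names and constants are illustrative assumptions, not the paper's code.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Unknown true dynamics (stand-in for the quadrotor): x' = Ax + Bu + noise.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])

def step(x, u):
    return A_true @ x + B_true @ u + 0.01 * rng.standard_normal(2)

# 1) Collect transitions under a random policy and fit a nonparametric GP
#    model of the one-step dynamics f(x, u) -> x' (one GP per state dim).
X, Y = [], []
x = np.array([1.0, 0.0])
for _ in range(100):
    u = rng.uniform(-1.0, 1.0, size=1)
    x_next = step(x, u)
    X.append(np.concatenate([x, u]))
    Y.append(x_next)
    x = x_next
X, Y = np.asarray(X), np.asarray(Y)
gps = [GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                normalize_y=True).fit(X, Y[:, d])
       for d in range(2)]

def gp_predict(x, u):
    # Mean prediction only; PILCO propagates the full Gaussian belief.
    z = np.concatenate([x, u]).reshape(1, -1)
    return np.array([gp.predict(z)[0] for gp in gps])

# 2) Quadratic cost: Q penalizes slow convergence, R penalizes input usage.
Q, R = np.eye(2), 0.1 * np.eye(1)
def cost(x, u):
    return float(x @ Q @ x + u @ R @ u)

# 3) Evaluate linear feedback gains K on the *learned* model and keep the
#    best one (random search here; PILCO uses analytic policy gradients).
def expected_cost(K, x0=np.array([1.0, 0.0]), horizon=30):
    x, total = x0.copy(), 0.0
    for _ in range(horizon):
        u = -K @ x
        total += cost(x, u)
        x = gp_predict(x, u)
    return total

best_K = min((rng.uniform(-2, 2, size=(1, 2)) for _ in range(200)),
             key=expected_cost)
print("feedback gain K =", best_K, " model-predicted cost =", expected_cost(best_K))

Because both the policy and the cost are fixed simple structures (a gain matrix and the weights Q, R), the learned controller can be read off and interpreted directly, which is the point the abstract makes about interpretability.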
Pages: 746-751
Related Papers (showing 10 of 50)
  • [1] Inverse Reinforcement Learning with Gaussian Process
    Qiao, Qifeng
    Beling, Peter A.
    2011 AMERICAN CONTROL CONFERENCE, 2011: 113-118
  • [2] Reinforcement learning for continuous spaces based on Gaussian process classifier
    Wang, Xue-Song
    Zhang, Yi-Yang
    Cheng, Yu-Hu
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2009, 37(6): 1153-1158
  • [3] Reinforcement Learning with a Gaussian Mixture Model
    Agostini, Alejandro
    Celaya, Enric
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN 2010), 2010
  • [4] Reinforcement Learning Based Aircraft Controller Enhanced By Gaussian Process Trim Finding
    Benyamen, Hady
    Chowdhury, Mozammal
    Keshmiri, Shawn
    ASME Letters in Dynamic Systems and Control, 2023, 3(3)
  • [5] Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes
    Airaldi, Filippo
    De Schutter, Bart
    Dabiri, Azita
    IFAC PAPERSONLINE, 2023, 56(2): 5759-5764
  • [6] Inverse Reinforcement Learning via Deep Gaussian Process
    Jin, Ming
    Damianou, Andreas
    Abbeel, Pieter
    Spanos, Costas
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017
  • [7] Gaussian Process Based Model Predictive Controller for Imitation Learning
    Joukov, Vladimir
    Kulic, Dana
    2017 IEEE-RAS 17TH INTERNATIONAL CONFERENCE ON HUMANOID ROBOTICS (HUMANOIDS), 2017: 850-855
  • [8] Learning from demonstration with model-based Gaussian process
    Jaquier, Noemie
    Ginsbourger, David
    Calinon, Sylvain
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019
  • [9] Cloud Job Scheduling Control Scheme Based on Gaussian Process Regression and Reinforcement Learning
    Peng, Zhiping
    Cui, Delong
    Xiong, Jianbin
    Xu, Bo
    Ma, Yuanjia
    Lin, Weiwei
    2016 IEEE 4TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD (FICLOUD 2016), 2016: 278-286
  • [10] Nonlinear Inverse Reinforcement Learning with Mutual Information and Gaussian Process
    Li, De C.
    He, Yu Q.
    Fu, Feng
    2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2014), 2014: 1445-1450