Gaussian process model based reinforcement learning

Cited by: 1
Authors
Yoo J.H. [1]
Affiliation
[1] Hankyong National University, Republic of Korea
Keywords
Gaussian Process Regression (GPR); PILCO (Probabilistic Inference for Learning Control); Reinforcement learning control system; UAV (Unmanned Aerial Vehicle)
DOI
10.5302/J.ICROS.2019.18.0221
Abstract
Reinforcement learning (RL) has been a promising approach in robotics and control because data-driven learning methods can reduce a system's reliance on human engineering knowledge. Model-based RL autonomously learns the observed dynamics using a general, flexible nonparametric approach. Probabilistic Inference for Learning COntrol (PILCO) is one of the most data-efficient model-based RL frameworks. Because PILCO sets up a Bayesian estimation problem with Gaussian process regression, it derives a fully deterministic approximate inference for policy evaluation, which makes it computationally efficient. However, PILCO requires a task-specific scenario. If an agent is given a new goal that differs from the original training goal, PILCO must relearn its model from scratch. This paper extends PILCO to tune a linear feedback controller with a quadratic cost function; the quadratic cost function commonly used in control systems can adjust the trade-off between control input consumption and convergence rate. The suggested method not only maintains the analytic and deterministic approximate inference for policy evaluation, but also makes the controller design interpretable. The suggested RL framework is applied to the control of a small quadrotor unmanned aerial vehicle (UAV) with no given dynamics. The simulation results show the convergence of the learning control performance as a function of the number of RL iterations. © ICROS 2019.
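The idea summarized in the abstract (learn the dynamics with Gaussian process regression, then tune a linear feedback gain against a quadratic cost) can be illustrated with a minimal sketch. This is not the paper's method: PILCO propagates full Gaussian state distributions by moment matching and uses gradient-based policy search, whereas the sketch below uses only the GP posterior mean and a grid search over a scalar gain. The toy plant `x_next = 0.9 x + 0.5 u`, the kernel hyperparameters, the cost weights `q` and `r`, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def fit_gp(X, y, noise_var=1e-4, **kern):
    """Return a function that evaluates the GP posterior mean at query points."""
    K = rbf_kernel(X, X, **kern) + noise_var * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return lambda Xq: rbf_kernel(Xq, X, **kern) @ alpha

def quadratic_cost(gain, predict_next, x0, q=1.0, r=0.1, horizon=30):
    """Roll out u = -gain * x through the learned model, summing q*x^2 + r*u^2."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -gain * x
        cost += q * x**2 + r * u**2
        x = float(predict_next(np.array([[x, u]]))[0])
    return cost

# Hypothetical plant, used only to generate training data; the learner never
# sees these coefficients directly.
rng = np.random.default_rng(0)
X_train = rng.uniform(-2.0, 2.0, size=(200, 2))      # columns: state x, input u
y_train = 0.9 * X_train[:, 0] + 0.5 * X_train[:, 1]  # next state

predict_next = fit_gp(X_train, y_train, length_scale=3.0)

# Grid search over the scalar feedback gain (a stand-in for policy search).
gains = np.linspace(0.0, 1.5, 16)
costs = [quadratic_cost(k, predict_next, x0=1.0) for k in gains]
best_gain = gains[int(np.argmin(costs))]
print(f"best gain on the grid: {best_gain:.1f}")
```

Raising `r` relative to `q` penalizes control effort more heavily and pushes the selected gain down, which is the input-versus-convergence trade-off the abstract attributes to the quadratic cost.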
Pages: 746 - 751
Page count: 5
Related papers
50 in total
  • [31] Robust Model-Based Reinforcement Learning Control of a Batch Crystallization Process
    Benyahia, B.
    Anandan, P. D.
    Rielly, C.
    2021 9TH INTERNATIONAL CONFERENCE ON SYSTEMS AND CONTROL (ICSC'21), 2021, : 89 - 94
  • [32] Reinforcement learning with Gaussian processes for condition-based maintenance
    Peng, Shenglin
    Feng, Qianmei
    COMPUTERS & INDUSTRIAL ENGINEERING, 2021, 158
  • [33] Prediction of Reward Functions for Deep Reinforcement Learning via Gaussian Process Regression
    Lim, Jaehyun
    Ha, Seungchul
    Choi, Jongeun
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2020, 25 (04) : 1739 - 1746
  • [34] Probabilistic prediction model for critical chloride concentration of reinforcement corrosion based on improved Gaussian process regression
    Zhou, Huanyu
    Wang, Zizhen
    Chen, Xiaojie
    Yu, Bo
    MAGAZINE OF CONCRETE RESEARCH, 2024
  • [35] Active Learning in Gaussian Process State Space Model
    Yu, Hon Sum Alec
    Yao, Dingling
    Zimmer, Christoph
    Toussaint, Marc
    Nguyen-Tuong, Duy
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 : 346 - 361
  • [36] The Multiple Instance Learning Gaussian Process Probit Model
    Wang, Fulton
    Pinar, Ali
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [37] Constrained Gaussian Process Learning for Model Predictive Control
    Matschek, Janine
    Himmel, Andreas
    Sundmacher, Kai
    Findeisen, Rolf
    IFAC PAPERSONLINE, 2020, 53 (02) : 971 - 976
  • [38] Supervised Gaussian Process Latent Variable Model Based on Gaussian Mixture Model
    Zhang, Jiayuan
    Zhu, Ziqi
    Zou, Jixin
    2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 124 - 129
  • [39] Simulation of English teaching quality evaluation model based on gaussian process machine learning
    Huang, Wenming
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (02) : 2373 - 2383
  • [40] Enhanced Gaussian Process Regression for Active Learning Model-based Predictive Control
    Ren, Rui
    Li, Shaoyuan
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 2731 - 2736