Structured Online Learning-based Control of Continuous-time Nonlinear Systems

Cited by: 4
Authors
Farsi, Milad [1]
Liu, Jun [1]
Affiliations
[1] Univ Waterloo, Appl Math Dept, Waterloo, ON, Canada
Source
IFAC PAPERSONLINE | 2020 / Vol. 53 / Issue 02
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Reinforcement learning; Model-based learning; Optimal control; Feedback control; Continuous-time control; Adaptive dynamic programming; Sparse identification;
DOI
10.1016/j.ifacol.2020.12.2299
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Model-based reinforcement learning techniques accelerate learning by using a transition model to make predictions. This paper presents a model-based learning approach that iteratively computes the optimal value function from the most recent update of the model. Assuming a structured continuous-time model of the system expressed in terms of a set of bases, we formulate an infinite-horizon optimal control problem for a given control objective. The structure of the system, together with a value function parameterized in quadratic form, makes it possible to derive an update rule for the parameters analytically. This yields a matrix differential equation for the parameters, whose solution characterizes the optimal feedback control in terms of the bases at any time step. Moreover, the quadratic form of the value function suggests a compact way of updating the parameters that considerably reduces the computational complexity. Because the differential equation is state-dependent, we use the resulting framework as an online learning-based algorithm. In the numerical results, the algorithm is applied to four nonlinear benchmark examples, where the regulation problem is solved successfully and an identified model of the system is obtained with a bounded prediction error. Copyright (C) 2020 The Authors.
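The abstract does not write out the matrix differential equation, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the simplest possible structure: the bases are the state itself, so the model is linear (dx/dt = A x + B u), the value function is V(x) = x^T P x, and the running cost is x^T Q x + u^T R u. In that special case a matrix differential equation for P takes the familiar Riccati form, and the feedback is linear in the state. The names `riccati_flow` and `feedback`, the double-integrator example, and the Euler step size are all hypothetical choices made for illustration.

```python
import numpy as np

def riccati_flow(A, B, Q, R, dt=1e-3, steps=20000):
    """Euler-integrate dP/dt = Q + A^T P + P A - P B R^{-1} B^T P from P = 0.

    Assumption: linear bases (Phi(x) = x), so the parameter matrix P of the
    quadratic value function V(x) = x^T P x follows a Riccati-type flow.
    """
    P = np.zeros_like(Q, dtype=float)
    R_inv = np.linalg.inv(R)
    for _ in range(steps):
        dP = Q + A.T @ P + P @ A - P @ B @ R_inv @ B.T @ P
        P = P + dt * dP
    return P

def feedback(P, B, R, x):
    """Feedback u = -R^{-1} B^T P x implied by the quadratic value function."""
    return -np.linalg.solve(R, B.T @ P @ x)

# Hypothetical example system: a double integrator.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = riccati_flow(A, B, Q, R)
u = feedback(P, B, R, np.array([1.0, 0.0]))
print(P)
print(u)
```

In the setting the abstract describes, the model parameters would be re-identified online and the bases would generally be nonlinear, which makes the differential equation state-dependent; the sketch above fixes both to keep the example self-contained.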
Pages: 8142-8149
Page count: 8