Learning-Based Predictive Control for Discrete-Time Nonlinear Systems With Stochastic Disturbances

Cited by: 55
Authors
Xu, Xin [1]
Chen, Hong [2, 3]
Lian, Chuanqiang [4]
Li, Dazi [5]
Affiliations
[1] Natl Univ Def Technol, Coll Intelligence Sci, Changsha 410073, Hunan, Peoples R China
[2] Jilin Univ NanLing, State Key Lab Automot Simulat & Control, Changchun 130025, Jilin, Peoples R China
[3] Jilin Univ NanLing, Dept Control Sci & Engn, Changchun 130025, Jilin, Peoples R China
[4] Naval Univ Engn, Natl Key Lab Sci & Technol Vessel Integrated Powe, Wuhan 430032, Hubei, Peoples R China
[5] Beijing Univ Chem Technol, Dept Automat, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive dynamic programming (ADP); function approximation; model predictive control (MPC); optimal control; receding horizon; reinforcement learning (RL); H-infinity control; policy iteration; control scheme
DOI
10.1109/TNNLS.2018.2820019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a learning-based predictive control (LPC) scheme is proposed for adaptive optimal control of discrete-time nonlinear systems under stochastic disturbances. The proposed LPC scheme differs from conventional model predictive control (MPC), which uses open-loop optimization or simplified closed-loop optimal control techniques in each horizon. In LPC, the control task in each horizon is formulated as a closed-loop nonlinear optimal control problem, and a finite-horizon iterative reinforcement learning (RL) algorithm is developed to obtain closed-loop optimal or suboptimal solutions. In this way, LPC uses RL and adaptive dynamic programming (ADP) as a new class of closed-loop learning-based optimization techniques for nonlinear predictive control with stochastic disturbances. LPC also decomposes the infinite-horizon optimal control problem of previous RL and ADP methods into a series of finite-horizon problems, which reduces the computational cost and improves the learning efficiency. Convergence of the finite-horizon iterative RL algorithm in each prediction horizon and Lyapunov stability of the closed-loop control system are proved. Moreover, by using successive policy updates between adjacent time horizons, LPC has lower computational costs than conventional MPC, which runs an independent optimization procedure for each prediction horizon. Simulation results illustrate that, compared with conventional nonlinear MPC as well as ADP, the proposed LPC scheme achieves better performance in terms of both policy optimality and computational efficiency.
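The idea of solving a closed-loop (state-feedback) optimal control problem in each prediction horizon can be illustrated with a small toy sketch. This is not the algorithm from the paper: it swaps the paper's iterative RL with function approximation for exact tabular backward dynamic programming, re-solves every horizon from scratch (so the successive policy updates between adjacent horizons are not shown), and uses a hypothetical scalar system. The state grid, action set, noise distribution, and cost are all assumptions made for illustration.

```python
import random

STATES = range(-5, 6)                      # assumed discretized state grid
ACTIONS = (-1, 0, 1)                       # assumed admissible controls
NOISE = ((-1, 0.25), (0, 0.5), (1, 0.25))  # assumed disturbance values and probabilities


def step(x, u, w):
    """Toy nonlinear dynamics: a saturating integrator with additive noise."""
    return max(-5, min(5, x + u + w))


def stage_cost(x, u):
    return x * x + u * u


def solve_horizon(n_steps):
    """Backward dynamic programming over one prediction horizon.

    Returns (policies, value): policies[k] maps state -> action at stage k,
    and value maps state -> expected cost-to-go from the first stage.
    """
    value = {x: float(x * x) for x in STATES}  # terminal cost
    policies = []
    for _ in range(n_steps):
        new_value, policy = {}, {}
        for x in STATES:
            best_u, best_q = None, float("inf")
            for u in ACTIONS:
                # Expected cost of applying u now, then acting optimally.
                q = stage_cost(x, u) + sum(
                    p * value[step(x, u, w)] for w, p in NOISE
                )
                if q < best_q:
                    best_u, best_q = u, q
            new_value[x], policy[x] = best_q, best_u
        value = new_value
        policies.append(policy)
    policies.reverse()  # policies[0] is now the first-stage feedback policy
    return policies, value


def run_receding_horizon(x0, n_steps=5, sim_steps=20, seed=0):
    """Apply only the first-stage policy of each horizon, then shift forward."""
    rng = random.Random(seed)
    noise_values = [w for w, _ in NOISE]
    noise_probs = [p for _, p in NOISE]
    x, trajectory = x0, [x0]
    for _ in range(sim_steps):
        policies, _ = solve_horizon(n_steps)  # re-solved from scratch each time
        u = policies[0][x]
        w = rng.choices(noise_values, weights=noise_probs)[0]
        x = step(x, u, w)
        trajectory.append(x)
    return trajectory
```

Because each stage policy is a map from state to action, the controller reacts to whatever disturbance actually occurs inside the horizon, which is the distinction the abstract draws against open-loop optimization of a fixed input sequence.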
Pages: 6202-6213
Number of pages: 12
Related papers
50 in total
  • [21] OPTIMAL PREDICTIVE CONTROL OF LINEAR DISCRETE-TIME STOCHASTIC SYSTEMS.
    Yahagi, Takashi
    Sakai, Katsuhito
    Electronics and Communications in Japan, Part I: Communications (English translation of Denshi Tsushin Gakkai Ronbunshi), 1985, 68 (12): 27-36
  • [22] CONTROL OF LINEAR DISCRETE-TIME STOCHASTIC DYNAMIC-SYSTEMS WITH MULTIPLICATIVE DISTURBANCES
    AOKI, M
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1975, AC20 (03): 388-392
  • [23] Stability of Nonlinear Stochastic Discrete-Time Systems
    Li, Yan
    Zhang, Weihai
    Liu, Xikui
    JOURNAL OF APPLIED MATHEMATICS, 2013
  • [24] DECOMPOSITION OF NONLINEAR DISCRETE-TIME STOCHASTIC SYSTEMS
    韩崇昭
    Acta Mathematica Scientia, 1985, (04): 399-413
  • [25] Losslessness of Nonlinear Stochastic Discrete-Time Systems
    Liu, Xikui
    Li, Yan
    Gao, Ning
    DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2015, 2015
  • [26] Robust discrete-time set-based adaptive predictive control for nonlinear systems
    Goncalves, Guilherme A. A.
    Guay, Martin
    JOURNAL OF PROCESS CONTROL, 2016, 39: 111-122
  • [27] Aperiodically Intermittent Control of Switched Stochastic Nonlinear Systems Based on Discrete-Time Observation
    Sun, Yuanyuan
    Deng, Feiqi
    Yu, Peilin
    Huang, Yongjia
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (01): 345-349
  • [28] DISCRETE-TIME ITERATIVE LEARNING CONTROL FOR NONLINEAR SYSTEMS BASED ON FEEDBACK LINEARIZATION
    Song, Bing
    Phan, Minh Q.
    Longman, Richard W.
    SPACEFLIGHT MECHANICS 2019, VOL 168, PTS I-IV, 2019, 168: 1603-1616
  • [29] Tube model predictive control for a class of nonlinear discrete-time systems
    Marrani, Hashem Imani
    Fazeli, Samane
    Malekizade, Hamid
    Hosseinzadeh, Hasan
    COGENT ENGINEERING, 2019, 6 (01)
  • [30] Robust fuzzy model predictive control for nonlinear discrete-time systems
    Mahmoudabadi, Parvin
    Naderi Akhormeh, Alireza
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024, 38 (03): 938-953