Learning-Based Predictive Control for Discrete-Time Nonlinear Systems With Stochastic Disturbances

Citations: 55

Authors:

Xu, Xin [1]

Chen, Hong [2,3]

Lian, Chuanqiang [4]

Li, Dazi [5]

Affiliations:

[1] Natl Univ Def Technol, Coll Intelligence Sci, Changsha 410073, Hunan, Peoples R China

[2] Jilin Univ NanLing, State Key Lab Automot Simulat & Control, Changchun 130025, Jilin, Peoples R China

[3] Jilin Univ NanLing, Dept Control Sci & Engn, Changchun 130025, Jilin, Peoples R China

[4] Naval Univ Engn, Natl Key Lab Sci & Technol Vessel Integrated Powe, Wuhan 430032, Hubei, Peoples R China

[5] Beijing Univ Chem Technol, Dept Automat, Beijing 100029, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2018 / Volume 29 / Issue 12

Funding:

National Natural Science Foundation of China;

Keywords:

Adaptive dynamic programming (ADP); function approximation; model predictive control (MPC); optimal control; receding horizon; reinforcement learning (RL); H-INFINITY CONTROL; POLICY ITERATION; CONTROL SCHEME;

DOI:

10.1109/TNNLS.2018.2820019

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a learning-based predictive control (LPC) scheme is proposed for adaptive optimal control of discrete-time nonlinear systems under stochastic disturbances. The proposed LPC scheme differs from conventional model predictive control (MPC), which uses open-loop optimization or simplified closed-loop optimal control techniques in each horizon. In LPC, the control task in each horizon is formulated as a closed-loop nonlinear optimal control problem, and a finite-horizon iterative reinforcement learning (RL) algorithm is developed to obtain the closed-loop optimal/suboptimal solutions. Therefore, in LPC, RL and adaptive dynamic programming (ADP) are used as a new class of closed-loop learning-based optimization techniques for nonlinear predictive control with stochastic disturbances. Moreover, LPC decomposes the infinite-horizon optimal control problem of previous RL and ADP methods into a series of finite-horizon problems, so that the computational costs are reduced and the learning efficiency can be improved. Convergence of the finite-horizon iterative RL algorithm in each prediction horizon and the Lyapunov stability of the closed-loop control system are proved. In addition, by using successive policy updates between adjacent time horizons, LPC has lower computational costs than conventional MPC, which runs independent optimization procedures in different prediction horizons. Simulation results illustrate that, compared with conventional nonlinear MPC as well as ADP, the proposed LPC scheme achieves better performance in terms of both policy optimality and computational efficiency.
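The receding-horizon idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: in place of the finite-horizon iterative RL method it uses plain tabular backward dynamic programming, and the scalar system, cost weights, grids, and all function names (`step`, `nearest`, `stage_cost`, `finite_horizon_policy`, `simulate`) are hypothetical choices made only for illustration. It shows a closed-loop (feedback) policy being computed over a finite horizon with the expectation taken over a stochastic disturbance, and only the first-step policy being applied at each time step.

```python
import math
import random

# Hypothetical toy problem (not from the paper):
# x_{k+1} = 0.9*x_k + 0.5*sin(x_k) + u_k + w_k, with w_k drawn uniformly from NOISE.
STATES = [i * 0.1 for i in range(-20, 21)]   # state grid on [-2, 2]
ACTIONS = [i * 0.25 for i in range(-4, 5)]   # action grid on [-1, 1]
NOISE = [-0.1, 0.0, 0.1]                     # equiprobable disturbance values


def step(x, u, w):
    """Assumed nonlinear dynamics with additive disturbance."""
    return 0.9 * x + 0.5 * math.sin(x) + u + w


def nearest(x):
    """Index of the grid state closest to x, clipped to the grid."""
    i = round((x + 2.0) / 0.1)
    return max(0, min(len(STATES) - 1, i))


def stage_cost(x, u):
    """Assumed quadratic stage cost."""
    return x * x + 0.1 * u * u


def finite_horizon_policy(horizon):
    """Backward DP over the horizon; returns policies[t][state_index] -> action."""
    value = [x * x for x in STATES]          # terminal cost
    policies = []
    for _ in range(horizon):
        new_value, policy = [], []
        for x in STATES:
            best_u, best_q = None, float("inf")
            for u in ACTIONS:
                # Expected cost-to-go, averaged over the disturbance values.
                q = stage_cost(x, u) + sum(
                    value[nearest(step(x, u, w))] for w in NOISE
                ) / len(NOISE)
                if q < best_q:
                    best_u, best_q = u, q
            new_value.append(best_q)
            policy.append(best_u)
        value = new_value
        policies.append(policy)
    policies.reverse()                       # policies[0] is the first-step policy
    return policies


def simulate(x0, steps, horizon=5, seed=0):
    """Receding-horizon loop: apply only the first-step feedback policy each step."""
    rng = random.Random(seed)
    # The toy problem is time-invariant, so the horizon problem is solved once
    # and reused; a learning-based scheme would instead update the policy
    # from one horizon to the next.
    first_step_policy = finite_horizon_policy(horizon)[0]
    x, traj = x0, [x0]
    for _ in range(steps):
        u = first_step_policy[nearest(x)]
        x = step(x, u, rng.choice(NOISE))
        traj.append(x)
    return traj
```

Calling `simulate(1.5, 30)` returns a 31-point state trajectory. Because the policy is a feedback law over states rather than a fixed open-loop input sequence, it reacts to whichever disturbance value is actually drawn.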

Pages: 6202 - 6213

Page count: 12

50 records in total

[21] OPTIMAL PREDICTIVE CONTROL OF LINEAR DISCRETE-TIME STOCHASTIC SYSTEMS.
Yahagi, Takashi
Sakai, Katsuhito
Electronics and Communications in Japan, Part I: Communications (English translation of Denshi Tsushin Gakkai Ronbunshi), 1985, 68 (12) : 27 - 36
[22] CONTROL OF LINEAR DISCRETE-TIME STOCHASTIC DYNAMIC-SYSTEMS WITH MULTIPLICATIVE DISTURBANCES
AOKI, M
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1975, AC20 (03) : 388 - 392
[23] Stability of Nonlinear Stochastic Discrete-Time Systems
Li, Yan
Zhang, Weihai
Liu, Xikui
JOURNAL OF APPLIED MATHEMATICS, 2013,
[24] DECOMPOSITION OF NONLINEAR DISCRETE-TIME STOCHASTIC SYSTEMS
Han, Chongzhao (韩崇昭)
Acta Mathematica Scientia, 1985, (04) : 399 - 413
[25] Losslessness of Nonlinear Stochastic Discrete-Time Systems
Liu, Xikui
Li, Yan
Gao, Ning
DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2015, 2015
[26] Robust discrete-time set-based adaptive predictive control for nonlinear systems
Goncalves, Guilherme A. A.
Guay, Martin
JOURNAL OF PROCESS CONTROL, 2016, 39 : 111 - 122
[27] Aperiodically Intermittent Control of Switched Stochastic Nonlinear Systems Based on Discrete-Time Observation
Sun, Yuanyuan
Deng, Feiqi
Yu, Peilin
Huang, Yongjia
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (01) : 345 - 349
[28] DISCRETE-TIME ITERATIVE LEARNING CONTROL FOR NONLINEAR SYSTEMS BASED ON FEEDBACK LINEARIZATION
Song, Bing
Phan, Minh Q.
Longman, Richard W.
SPACEFLIGHT MECHANICS 2019, VOL 168, PTS I-IV, 2019, 168 : 1603 - 1616
[29] Tube model predictive control for a class of nonlinear discrete-time systems
Marrani, Hashem Imani
Fazeli, Samane
Malekizade, Hamid
Hosseinzadeh, Hasan
COGENT ENGINEERING, 2019, 6 (01):
[30] Robust fuzzy model predictive control for nonlinear discrete-time systems
Mahmoudabadi, Parvin
Naderi Akhormeh, Alireza
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024, 38 (03) : 938 - 953
