Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

Cited by: 539

Authors:

Liu, Derong [1]

Wei, Qinglai [1]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2014 / Vol. 25 / No. 03

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation;

Keywords:

Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time policy iteration; neural networks; neurodynamic programming; nonlinear systems; optimal control; reinforcement learning; NETWORKED CONTROL-SYSTEM; OPTIMAL TRACKING CONTROL; ONLINE LEARNING CONTROL; CONTROL SCHEME; FEEDBACK-CONTROL; CRITIC DESIGNS; REINFORCEMENT; APPROXIMATION; ARCHITECTURE;

DOI:

10.1109/TNNLS.2013.2281663

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper is concerned with a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain the iterative control law, which optimizes the iterative performance index function. The main contribution of this paper is to analyze, for the first time, the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems. It is shown that the iterative performance index function is monotonically nonincreasing and converges to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that any of the iterative control laws can stabilize the nonlinear system. To facilitate the implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, respectively, and the convergence of their weight matrices is analyzed. Finally, numerical results and analysis are presented to illustrate the performance of the developed method.
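The two properties stated in the abstract (a nonincreasing performance index and stabilizing iterative control laws) can be illustrated with a minimal sketch. It is not the paper's neural-network implementation: it assumes a scalar linear system x(k+1) = a*x(k) + b*u(k) as a special case of the nonlinear setting, a quadratic utility q*x^2 + r*u^2, and a linear control law u = -k*x, so that policy evaluation has a closed form. The function name, parameters, and numerical values are all hypothetical.

```python
def policy_iteration(a, b, q, r, k0, iters=20):
    """Policy iteration for x(k+1) = a*x + b*u with utility q*x^2 + r*u^2.

    Assumed special case: the control law is u = -k*x and the performance
    index is V(x) = p*x^2. Returns the lists of p_i and k_i per iteration.
    """
    k = k0
    ps, ks = [], []
    for _ in range(iters):
        c = a - b * k  # closed-loop coefficient under the current law
        if abs(c) >= 1.0:
            raise ValueError("control law is not stabilizing (not admissible)")
        # Policy evaluation: solve p = q + r*k^2 + c^2 * p for p.
        p = (q + r * k * k) / (1.0 - c * c)
        ps.append(p)
        ks.append(k)
        # Policy improvement: minimize q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return ps, ks


# Hypothetical example values; k0 = 1.0 gives |a - b*k0| = 0.2 < 1 (admissible).
ps, ks = policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)
```

In this sketch the sequence `ps` should be nonincreasing and should approach the positive root of p^2 - 1.44*p - 1 = 0, which is the scalar Riccati solution for these assumed values, while every gain in `ks` keeps |a - b*k| below 1.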


Pages: 621 - 634

Page count: 14

50 items in total

[31] An Event-Triggered Heuristic Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
Wang, Ziyang
Wei, Qinglai
Liu, Derong
[J]. NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 741 - 748
[32] Parallel Cross Entropy Policy Gradient Adaptive Dynamic Programming for Optimal Tracking Control of Discrete-Time Nonlinear Systems
Xu, Jiahui
Wang, Jingcheng
Rao, Jun
Zhong, Yanjiu
Wu, Shunyu
Sun, Qifang
[J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (06) : 3809 - 3821
[33] Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems With Trajectory-Based Initial Control Policy
Xu, Jiahui
Wang, Jingcheng
Rao, Jun
Wu, Shunyu
Zhong, Yanjiu
[J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (03) : 1489 - 1501
[34] Generalized Policy Iteration Adaptive Dynamic Programming Algorithm for Optimal Tracking Control of a Class of Nonlinear Systems
Lin, Qiao
Wei, Qinglai
Liu, Derong
[J]. PROCEEDINGS OF THE 28TH CHINESE CONTROL AND DECISION CONFERENCE (2016 CCDC), 2016 : 5009 - 5014
[35] Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis
Wei, Qinglai
Liu, Derong
Lin, Qiao
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (11) : 2490 - 2502
[36] Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
Wei, Qinglai
Wang, Ding
Zhang, Dehua
[J]. NEURAL COMPUTING & APPLICATIONS, 2013, 23 (7-8) : 1851 - 1863
[37] Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
Qinglai Wei
Ding Wang
Dehua Zhang
[J]. Neural Computing and Applications, 2013, 23 : 1851 - 1863
[38] Discrete-Time ε-Adaptive Dynamic Programming Algorithm Using Neural Networks
Jin, Ning
Liu, Derong
[J]. PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL, 2008 : 114 - 119
[39] Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems
Guangyu Zhu
Xiaolu Li
Ranran Sun
Yiyuan Yang
Peng Zhang
[J]. IEEE/CAA Journal of Automatica Sinica, 2023, 10 (03) : 781 - 791
[40] Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems
Zhu, Guangyu
Li, Xiaolu
Sun, Ranran
Yang, Yiyuan
Zhang, Peng
[J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2023, 10 (03) : 781 - 791
