Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

Cited by: 539

Authors:

Liu, Derong [1]

Wei, Qinglai [1]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2014 / Vol. 25 / No. 03

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation

Keywords:

Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time policy iteration; neural networks; neurodynamic programming; nonlinear systems; optimal control; reinforcement learning; NETWORKED CONTROL-SYSTEM; OPTIMAL TRACKING CONTROL; ONLINE LEARNING CONTROL; CONTROL SCHEME; FEEDBACK-CONTROL; CRITIC DESIGNS; REINFORCEMENT; APPROXIMATION; ARCHITECTURE;

DOI:

10.1109/TNNLS.2013.2281663

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper is concerned with a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain the iterative control law, which optimizes the iterative performance index function. The main contribution of this paper is to analyze the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems for the first time. It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that any of the iterative control laws can stabilize the nonlinear systems. Neural networks are used to approximate the performance index function and compute the optimal control law, respectively, to facilitate the implementation of the iterative ADP algorithm, where the convergence of the weight matrices is analyzed. Finally, numerical results and analysis are presented to illustrate the performance of the developed method.
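The two properties stated in the abstract (a nonincreasing iterative performance index and a stabilizing control law at every iteration) can be illustrated on a much simpler problem than the one the paper treats. The sketch below is not the authors' neural-network implementation. It assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with utility q*x^2 + r*u^2, for which policy evaluation and policy improvement have closed forms. The function name `policy_iteration_lq` and all parameter names are hypothetical, chosen only for this illustration.

```python
def policy_iteration_lq(a, b, q, r, k0, iterations=20):
    """Policy iteration for the scalar system x_{k+1} = a*x_k + b*u_k.

    Assumed setting (not from the paper): linear dynamics, quadratic
    utility q*x^2 + r*u^2, linear control laws u = -k*x.
    Returns a list of (p, k, closed_loop) tuples, one per iteration,
    where V_i(x) = p*x^2 is the iterative performance index function.
    """
    # Policy iteration must start from an admissible (stabilizing) law.
    if abs(a - b * k0) >= 1.0:
        raise ValueError("initial gain k0 must satisfy |a - b*k0| < 1")

    k = k0
    history = []
    for _ in range(iterations):
        closed_loop = a - b * k
        # Policy evaluation: V_i(x) = p*x^2 solves
        #   p = q + r*k^2 + closed_loop^2 * p
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        history.append((p, k, closed_loop))
        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u,
        # which gives u = -k*x with the gain below.
        k = a * b * p / (r + b * b * p)
    return history


if __name__ == "__main__":
    # Open-loop unstable example (a > 1) with a stabilizing initial gain.
    for i, (p, k, cl) in enumerate(policy_iteration_lq(1.2, 1.0, 1.0, 1.0, k0=1.0, iterations=6)):
        print(f"iteration {i}: p = {p:.6f}, gain = {k:.6f}, closed loop = {cl:.6f}")
```

In this special case the sequence of p values should decrease toward the solution of the scalar discrete-time Riccati equation, and each closed-loop coefficient should stay inside the unit interval, mirroring the convergence and stability statements in the abstract. For general nonlinear systems no such closed form exists, which is why the paper uses neural networks to approximate both steps.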


Pages: 621 - 634

Number of pages: 14

50 items in total

[1] Generalized Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
Liu, Derong
Wei, Qinglai
Yan, Pengfei
[J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2015, 45 (12) : 1577 - 1591
[2] Local Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
Wei, Qinglai
Xu, Yancai
Lin, Qiao
Liu, Derong
Song, Ruizhuo
[J]. ADVANCES IN NEURAL NETWORKS, PT II, 2017, 10262 : 148 - 153
[3] Adaptive Dynamic Programming with Stable Value Iteration Algorithm for Discrete-Time Nonlinear Systems
Wei, Qinglai
Liu, Derong
[J]. 2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
[4] A Generalized Policy Iteration Adaptive Dynamic Programming Algorithm for Optimal Control of Discrete-Time Nonlinear Systems with Actuator Saturation
Lin, Qiao
Wei, Qinglai
Zhao, Bo
[J]. ADVANCES IN NEURAL NETWORKS, PT II, 2017, 10262 : 60 - 65
[5] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
Guo, Wentao
Si, Jennie
Liu, Feng
Mei, Shengwei
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
[6] Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
Wei, Qinglai
Liu, Derong
Lin, Hanquan
[J]. IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (03) : 840 - 853
[7] Modified λ-Policy Iteration Based Adaptive Dynamic Programming for Unknown Discrete-Time Linear Systems
Jiang, Huaiyuan
Zhou, Bin
Duan, Guang-Ren
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3291 - 3301
[8] Optimal Learning Control for Discrete-Time Nonlinear Systems Using Generalized Policy Iteration Based Adaptive Dynamic Programming
Wei, Qinglai
Liu, Derong
[J]. 2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014, : 1781 - 1786
[9] A novel optimal tracking control scheme for a class of discrete-time nonlinear systems using generalised policy iteration adaptive dynamic programming algorithm
Lin, Qiao
Wei, Qinglai
Liu, Derong
[J]. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2017, 48 (03) : 525 - 534
[10] Modified general policy iteration based adaptive dynamic programming for unknown discrete-time linear systems
Jiang, Huaiyuan
Zhou, Bin
Duan, Guang-Ren
[J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (12) : 7149 - 7173
