Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

Cited by: 539
Authors
Liu, Derong [1]
Wei, Qinglai [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time policy iteration; neural networks; neurodynamic programming; nonlinear systems; optimal control; reinforcement learning; NETWORKED CONTROL-SYSTEM; OPTIMAL TRACKING CONTROL; ONLINE LEARNING CONTROL; CONTROL SCHEME; FEEDBACK-CONTROL; CRITIC DESIGNS; REINFORCEMENT; APPROXIMATION; ARCHITECTURE;
DOI
10.1109/TNNLS.2013.2281663
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper is concerned with a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite-horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain the iterative control law, which optimizes the iterative performance index function. The main contribution of this paper is to analyze, for the first time, the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems. It is shown that the iterative performance index function is nonincreasing and converges to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that any of the iterative control laws can stabilize the nonlinear system. To facilitate the implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, respectively, and the convergence of the weight matrices is analyzed. Finally, numerical results and analysis are presented to illustrate the performance of the developed method.
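The two properties stated in the abstract (a nonincreasing performance index and stabilizing iterative control laws) can be illustrated with a minimal sketch. It is not the paper's neural-network implementation: it assumes a scalar linear system x[k+1] = a*x[k] + b*u[k] with quadratic stage cost q*x^2 + r*u^2, a special case in which policy evaluation and policy improvement have closed forms. The function name, the parameters, and the numerical values are hypothetical and chosen only for illustration.

```python
def policy_iteration_scalar_lq(a, b, q, r, k0, iters=20):
    """Policy iteration for x[k+1] = a*x[k] + b*u[k] with u = -k*x.

    Illustrative assumption: scalar linear dynamics and stage cost
    q*x**2 + r*u**2, so each performance index has the form p*x**2.
    Returns a list of (p, k) pairs, one per iteration.
    """
    if abs(a - b * k0) >= 1.0:
        raise ValueError("initial gain k0 must stabilize the system")
    k = k0
    history = []
    for _ in range(iters):
        closed = a - b * k  # closed-loop pole under the current control law
        # Policy evaluation: solve p = q + r*k^2 + closed^2 * p for p.
        p = (q + r * k * k) / (1.0 - closed * closed)
        history.append((p, k))
        # Policy improvement: minimize q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return history


if __name__ == "__main__":
    # Hypothetical example: unstable open loop (a = 1.2), stabilizing k0 = 1.0.
    for i, (p, k) in enumerate(policy_iteration_scalar_lq(1.2, 1.0, 1.0, 1.0, 1.0, iters=5)):
        print(f"iteration {i}: p = {p:.6f}, gain = {k:.6f}, |a - b*k| = {abs(1.2 - k):.6f}")
```

In this special case the sequence of p values should not increase from one iteration to the next, and every gain should keep |a - b*k| below 1, mirroring the monotonicity and stability claims in the abstract; the nonlinear setting treated in the paper requires function approximation instead of these closed forms.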
Pages: 621-634
Number of pages: 14
Related papers
50 in total
  • [31] An Event-Triggered Heuristic Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
    Wang, Ziyang
    Wei, Qinglai
    Liu, Derong
    [J]. NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 741 - 748
  • [32] Parallel Cross Entropy Policy Gradient Adaptive Dynamic Programming for Optimal Tracking Control of Discrete-Time Nonlinear Systems
    Xu, Jiahui
    Wang, Jingcheng
    Rao, Jun
    Zhong, Yanjiu
    Wu, Shunyu
    Sun, Qifang
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (06): : 3809 - 3821
  • [33] Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems With Trajectory-Based Initial Control Policy
    Xu, Jiahui
    Wang, Jingcheng
    Rao, Jun
    Wu, Shunyu
    Zhong, Yanjiu
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (03): : 1489 - 1501
  • [34] Generalized Policy Iteration Adaptive Dynamic Programming Algorithm for Optimal Tracking Control of a Class of Nonlinear Systems
    Lin, Qiao
    Wei, Qinglai
    Liu, Derong
    [J]. PROCEEDINGS OF THE 28TH CHINESE CONTROL AND DECISION CONFERENCE (2016 CCDC), 2016, : 5009 - 5014
  • [35] Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis
    Wei, Qinglai
    Liu, Derong
    Lin, Qiao
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (11) : 2490 - 2502
  • [36] Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
    Wei, Qinglai
    Wang, Ding
    Zhang, Dehua
    [J]. NEURAL COMPUTING & APPLICATIONS, 2013, 23 (7-8): : 1851 - 1863
  • [37] Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays
    Qinglai Wei
    Ding Wang
    Dehua Zhang
    [J]. Neural Computing and Applications, 2013, 23 : 1851 - 1863
  • [38] Discrete-Time ε-Adaptive Dynamic Programming Algorithm Using Neural Networks
    Jin, Ning
    Liu, Derong
    [J]. PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL, 2008, : 114 - 119
  • [39] Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems
    Guangyu Zhu
    Xiaolu Li
    Ranran Sun
    Yiyuan Yang
    Peng Zhang
    [J]. IEEE/CAA Journal of Automatica Sinica, 2023, 10 (03) : 781 - 791
  • [40] Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems
    Zhu, Guangyu
    Li, Xiaolu
    Sun, Ranran
    Yang, Yiyuan
    Zhang, Peng
    [J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2023, 10 (03) : 781 - 791