Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

Cited by: 539
Authors
Liu, Derong [1]
Wei, Qinglai [1]
Affiliation
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time policy iteration; neural networks; neurodynamic programming; nonlinear systems; optimal control; reinforcement learning; NETWORKED CONTROL-SYSTEM; OPTIMAL TRACKING CONTROL; ONLINE LEARNING CONTROL; CONTROL SCHEME; FEEDBACK-CONTROL; CRITIC DESIGNS; REINFORCEMENT; APPROXIMATION; ARCHITECTURE;
DOI
10.1109/TNNLS.2013.2281663
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite-horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain the iterative control law that optimizes the iterative performance index function. The main contribution of this paper is to analyze, for the first time, the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems. It is shown that the iterative performance index function converges nonincreasingly to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that each of the iterative control laws stabilizes the nonlinear system. To facilitate implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, and the convergence of the weight matrices is analyzed. Finally, numerical results and analysis are presented to illustrate the performance of the developed method.
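The alternation between policy evaluation and policy improvement described in the abstract can be illustrated on a much simpler problem. The sketch below is not the paper's neural-network implementation. It assumes a linear system x(k+1) = A x(k) + B u(k) with quadratic cost, for which the performance index function is x'Px and each evaluation step reduces to a Lyapunov equation. The matrices A, B, Q, R, the initial gain K0, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np


def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation: solve P = Q + K'RK + (A - BK)' P (A - BK)."""
    n = A.shape[0]
    Ac = A - B @ K               # closed-loop dynamics under u = -K x
    stage = Q + K.T @ R @ K      # per-step cost matrix under the same law
    # Vectorized Lyapunov equation: (I - Ac' kron Ac') vec(P) = vec(stage)
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(lhs, stage.reshape(-1)).reshape(n, n)


def improve_policy(A, B, R, P):
    """Policy improvement: gain that is greedy with respect to x'Px."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def policy_iteration(A, B, Q, R, K0, iterations=10):
    """Alternate evaluation and improvement, starting from a stabilizing K0."""
    gains, values = [K0], []
    for _ in range(iterations):
        P = evaluate_policy(A, B, Q, R, gains[-1])
        values.append(P)
        gains.append(improve_policy(A, B, R, P))
    return gains, values


# Assumed example: a discretized double integrator (hypothetical values).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])      # initial stabilizing (admissible) gain

gains, values = policy_iteration(A, B, Q, R, K0)
```

In this linear-quadratic setting the successive matrices in `values` should be nonincreasing in the positive semidefinite order, and every gain in `gains` should keep the closed loop stable. These mirror, for this special case only, the convergence and stability properties the abstract states for the nonlinear case.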
Pages: 621 - 634
Page count: 14
Related papers
50 in total
  • [1] Generalized Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
    Liu, Derong
    Wei, Qinglai
    Yan, Pengfei
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2015, 45 (12) : 1577 - 1591
  • [2] Local Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
    Wei, Qinglai
    Xu, Yancai
    Lin, Qiao
    Liu, Derong
    Song, Ruizhuo
    [J]. ADVANCES IN NEURAL NETWORKS, PT II, 2017, 10262 : 148 - 153
  • [3] Adaptive Dynamic Programming with Stable Value Iteration Algorithm for Discrete-Time Nonlinear Systems
    Wei, Qinglai
    Liu, Derong
    [J]. 2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
  • [4] A Generalized Policy Iteration Adaptive Dynamic Programming Algorithm for Optimal Control of Discrete-Time Nonlinear Systems with Actuator Saturation
    Lin, Qiao
    Wei, Qinglai
    Zhao, Bo
    [J]. ADVANCES IN NEURAL NETWORKS, PT II, 2017, 10262 : 60 - 65
  • [5] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
    Guo, Wentao
    Si, Jennie
    Liu, Feng
    Mei, Shengwei
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
  • [6] Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
    Wei, Qinglai
    Liu, Derong
    Lin, Hanquan
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (03) : 840 - 853
  • [7] Modified λ-Policy Iteration Based Adaptive Dynamic Programming for Unknown Discrete-Time Linear Systems
    Jiang, Huaiyuan
    Zhou, Bin
    Duan, Guang-Ren
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3291 - 3301
  • [8] Optimal Learning Control for Discrete-Time Nonlinear Systems Using Generalized Policy Iteration Based Adaptive Dynamic Programming
    Wei, Qinglai
    Liu, Derong
    [J]. 2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014 : 1781 - 1786
  • [9] A novel optimal tracking control scheme for a class of discrete-time nonlinear systems using generalised policy iteration adaptive dynamic programming algorithm
    Lin, Qiao
    Wei, Qinglai
    Liu, Derong
    [J]. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2017, 48 (03) : 525 - 534
  • [10] Modified general policy iteration based adaptive dynamic programming for unknown discrete-time linear systems
    Jiang, Huaiyuan
    Zhou, Bin
    Duan, Guang-Ren
    [J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (12) : 7149 - 7173