Novel Discounted Adaptive Critic Control Designs With Accelerated Learning Formulation

Cited by: 14
Authors
Ha, Mingming [1 ,2 ]
Wang, Ding [3 ]
Liu, Derong [4 ,5 ]
Affiliations
[1] Ant Grp, MYbank, Beijing 100020, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Automation & Elect Engn, Beijing 100083, Peoples R China
[3] Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
[4] Southern Univ Sci & Technol, Sch Syst Design & Intelligent Mfg, Shenzhen 518055, Peoples R China
[5] Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Iterative methods; Convergence; Power system stability; Optimal control; Stability criteria; Cost function; Closed loop systems; Adaptive critic designs; adaptive dynamic programming (ADP); discrete-time nonlinear systems; fast convergence rate; reinforcement learning; value iteration (VI); STABILITY ANALYSIS; VALUE-ITERATION; SUBJECT;
DOI
10.1109/TCYB.2022.3233593
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Inspired by the successive relaxation method, a novel discounted iterative adaptive dynamic programming framework is developed, in which the iterative value function sequence possesses an adjustable convergence rate. The different convergence properties of the value function sequence and the stability of the closed-loop systems under the new discounted value iteration (VI) are investigated. Based on the properties of the given VI scheme, an accelerated learning algorithm with convergence guarantee is presented. Moreover, the implementations of the new VI scheme and its accelerated learning design are elaborated, which involve value function approximation and policy improvement. A nonlinear fourth-order ball-and-beam balancing plant is used to verify the performance of the developed approaches. Compared with the traditional VI, the present discounted iterative adaptive critic designs greatly accelerate the convergence rate of the value function and reduce the computational cost simultaneously.
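The abstract does not state the update rule itself, so the following is only a minimal sketch of the general idea of successive relaxation applied to discounted value iteration, not the authors' scheme. It assumes a hypothetical scalar linear plant with quadratic cost (not the ball-and-beam system from the paper), for which the Bellman operator has a closed form, and blends the previous iterate with the Bellman update through a relaxation factor `omega`. All names and parameter values (`A`, `B`, `Q`, `R`, `GAMMA`, `bellman`, `relaxed_vi`) are illustrative assumptions.

```python
"""Relaxed value iteration on a scalar discounted LQ problem (illustrative only)."""

# Hypothetical toy plant: x_{k+1} = A*x_k + B*u_k, stage cost Q*x^2 + R*u^2.
A, B, Q, R = 1.0, 0.5, 1.0, 10.0
GAMMA = 0.95  # discount factor (assumed value)


def bellman(p: float) -> float:
    """Discounted Bellman operator applied to V(x) = p*x^2.

    Minimizing Q*x^2 + R*u^2 + GAMMA*p*(A*x + B*u)^2 over u gives another
    quadratic in x; this returns its coefficient.
    """
    return Q + GAMMA * A**2 * p * R / (R + GAMMA * B**2 * p)


def relaxed_vi(omega: float, p0: float = 0.0, tol: float = 1e-10,
               max_iter: int = 10_000) -> tuple[float, int]:
    """Iterate p <- (1 - omega)*p + omega*bellman(p) until the step is below tol.

    omega = 1 recovers standard value iteration; omega > 1 over-relaxes.
    Returns the final coefficient and the number of iterations used.
    """
    p = p0
    for k in range(1, max_iter + 1):
        p_next = (1.0 - omega) * p + omega * bellman(p)
        if abs(p_next - p) < tol:
            return p_next, k
        p = p_next
    raise RuntimeError("value iteration did not converge")


p_std, n_std = relaxed_vi(omega=1.0)   # standard VI
p_rel, n_rel = relaxed_vi(omega=1.3)   # over-relaxed VI (omega chosen by hand)
print(f"standard VI: p = {p_std:.6f} after {n_std} iterations")
print(f"relaxed VI:  p = {p_rel:.6f} after {n_rel} iterations")
```

In this toy setting both runs reach the same fixed point, and the over-relaxed run needs fewer iterations. The factor `omega = 1.3` is picked by hand here; a value that is too large can destroy convergence, which is why an adjustable rate calls for the kind of convergence analysis the abstract mentions.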
Pages: 3003-3016
Number of pages: 14
Related papers
50 in total
  • [31] Optimal H∞ tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs
    Peng, Zhinan
    Ji, Hanqi
    Zou, Chaobin
    Kuang, Yiqun
    Cheng, Hong
    Shi, Kaibo
    Ghosh, Bijoy Kumar
    NEURAL NETWORKS, 2023, 164 : 105 - 114
  • [32] ON EXPONENTIALLY DISCOUNTED ADAPTIVE-CONTROL
    DUNCAN, TE
    MANDL, P
    PASIKDUNCAN, B
    KYBERNETIKA, 1990, 26 (05) : 361 - 372
  • [33] Learning-based adaptive control with an accelerated iterative adaptive law
    Shi, Zhongjiao
    Zhao, Liangyu
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2020, 357 (10): : 5831 - 5851
  • [34] An Accelerated Adaptive Gain Design in Stochastic Learning Control
    Cheng, Xiang
    Jiang, Hao
    Shen, Dong
    Yu, Xinghuo
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, : 7416 - 7429
  • [35] Research Progress on Learning-based Robust Adaptive Critic Control
    Wang D.
    Zidonghua Xuebao/Acta Automatica Sinica, 2019, 45 (06): : 1031 - 1043
  • [36] CMAC-BASED ADAPTIVE CRITIC SELF-LEARNING CONTROL
    LIN, CS
    KIM, H
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 1991, 2 (05): : 530 - 533
  • [37] Adaptive Learning in Tracking Control Based on the Dual Critic Network Design
    Ni, Zhen
    He, Haibo
    Wen, Jinyu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (06) : 913 - 928
  • [38] Parallel Adaptive Critic Designs of Optimal Control for Ice-Storage Air Conditioning Systems
    Liao, Zehua
    Wei, Qinglai
    Song, Ruizhuo
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 37 - 42
  • [39] Data-Based Adaptive Critic Designs for Nonlinear Robust Optimal Control With Uncertain Dynamics
    Wang, Ding
    Liu, Derong
    Zhang, Qichao
    Zhao, Dongbin
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2016, 46 (11): : 1544 - 1555
  • [40] Adaptive Critic Designs for Event-Triggered Robust Control of Nonlinear Systems With Unknown Dynamics
    Yang, Xiong
    He, Haibo
    IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (06) : 2255 - 2267