Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems using Online Approximators

Cited by: 154
Authors
Yang, Qinmin [1]
Jagannathan, Sarangapani [2]
Affiliations
[1] Zhejiang Univ, Dept Control Sci & Engn, State Key Lab Ind Control Technol, Hangzhou 310027, Zhejiang, Peoples R China
[2] Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65409 USA
Keywords
Adaptive critic; dynamic programming (DP); Lyapunov method; neural networks (NNs); online approximators (OLAs); online learning; reinforcement learning
DOI
10.1109/TSMCB.2011.2166384
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
In this paper, reinforcement-learning-based state- and output-feedback adaptive critic controller designs are proposed using online approximators (OLAs) for a general class of multi-input, multi-output affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. The proposed controller design has two entities: an action network that is designed to produce the optimal control signal, and a critic network that evaluates the performance of the action network. The critic estimates the cost-to-go function, which is tuned online using recursive equations derived from heuristic dynamic programming. Here, neural networks (NNs) are used for both the action and critic networks, although any OLA, such as radial basis functions, splines, or fuzzy logic, can be utilized. For the output-feedback counterpart, an additional NN is designated as the observer to estimate the unavailable system states, and thus the separation principle is not required. The NN weight tuning laws for the controller schemes are also derived while ensuring uniform ultimate boundedness of the closed-loop system using Lyapunov theory. Finally, the effectiveness of the two controllers is tested in simulation on a pendulum balancing system and a two-link robotic arm system.
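The action/critic structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' tuning law: it assumes a hypothetical scalar affine system x_{k+1} = f(x_k) + g u_k with a known input gain g and no disturbance, a quadratic stage cost, a discount factor GAMMA, and single-weight approximators (critic J(x) ≈ w_c x^2, action u = w_a x). All names and numeric values below are illustrative assumptions.

```python
import math

GAMMA = 0.95  # assumed discount factor (not stated in the abstract)
G = 1.0       # assumed known input gain g(x) = 1 (the paper treats the system as unknown)


def f(x):
    """Hypothetical drift term of the affine system x_{k+1} = f(x_k) + G * u_k."""
    return 0.8 * math.sin(x)


def stage_cost(x, u):
    """Assumed quadratic one-step cost r(x, u) = x^2 + u^2."""
    return x * x + u * u


def run_hdp(x0=1.0, steps=60, alpha_c=0.1, alpha_a=0.05):
    """Run one online trajectory, tuning critic and action weights at every step."""
    w_c = 0.0  # critic weight: cost-to-go estimate J(x) ~ w_c * x^2
    w_a = 0.0  # action weight: control law u = w_a * x
    x = x0
    for _ in range(steps):
        u = w_a * x
        x_next = f(x) + G * u
        phi = x * x  # critic basis function evaluated at the current state

        # HDP-style recursion: J(x_k) should match r(x_k, u_k) + GAMMA * J(x_{k+1}).
        td_error = stage_cost(x, u) + GAMMA * w_c * x_next * x_next - w_c * phi

        # Gradient of r + GAMMA * J(x_{k+1}) with respect to u, using the current critic.
        grad_u = 2.0 * u + GAMMA * 2.0 * w_c * x_next * G

        w_c += alpha_c * td_error * phi  # critic: semi-gradient step on the TD error
        w_a -= alpha_a * grad_u * x      # action: descend the estimated cost-to-go
        x = x_next
    return x, w_c, w_a


if __name__ == "__main__":
    x_final, w_c, w_a = run_hdp()
    print(f"final state={x_final:.6f}, critic weight={w_c:.4f}, action weight={w_a:.4f}")
```

In this toy setting the critic weight grows toward a positive cost-to-go estimate and the action weight becomes negative, so the controller adds damping to the state. The paper's designs instead use multi-weight NNs, handle bounded disturbances, and establish uniform ultimate boundedness through Lyapunov analysis, none of which this sketch attempts.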
Pages: 377-390
Page count: 14
Related papers
50 in total
  • [21] Reinforcement Learning Controller Design for Discrete-Time-Constrained Nonlinear Systems With Weight Initialization Method
    Xu, Jiahui
    Wang, Jingcheng
    Zhong, Yanjiu
    Rao, Jun
    Wu, Shunyu
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (04): 2368-2378
  • [22] Reinforcement Learning Policies With Local LQR Guarantees For Nonlinear Discrete-Time Systems
    Zoboli, Samuele
    Andrieu, Vincent
    Astolfi, Daniele
    Casadei, Giacomo
    Dibangoye, Jilles S.
    Nadri, Madiha
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021: 2258-2263
  • [23] Off-policy safe reinforcement learning for nonlinear discrete-time systems
    Jha, Mayank Shekhar
    Kiumarsi, Bahare
    Neurocomputing, 2025, 611
  • [24] Adaptive NN Controller Design for a Class of Nonlinear MIMO Discrete-Time Systems
    Liu, Yan-Jun
    Tang, Li
    Tong, Shaocheng
    Chen, C. L. Philip
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (05): 1007-1018
  • [25] Online Adaptive Policy Learning Algorithm for H∞ State Feedback Control of Unknown Affine Nonlinear Discrete-Time Systems
    Zhang, Huaguang
    Qin, Chunbin
    Jiang, Bin
    Luo, Yanhong
    IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (12): 2706-2718
  • [26] Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems
    College of Information Science and Engineering, Northeastern University, Shenyang 110004, China
    Unknown, 110004, China
    Unknown, 475004, China
    Unknown, 210016, China
    IEEE Trans. Cybern., (12): 2706-2718
  • [27] Stability analysis and controller design for discrete-time periodic nonlinear quadratic systems
    Kang, Shugui
    Zhao, Xia
    Chen, Fu
    2016 IEEE CHINESE GUIDANCE, NAVIGATION AND CONTROL CONFERENCE (CGNCC), 2016: 1259-1263
  • [28] Online optimal and adaptive integral tracking control for varying discrete-time systems using reinforcement learning
    Sanusi, Ibrahim
    Mills, Andrew
    Dodd, Tony
    Konstantopoulos, George
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2020, 34 (08): 971-991
  • [29] Adaptive control of a class of discrete-time affine nonlinear systems
    Xie, LL
    Guo, L
    SYSTEMS & CONTROL LETTERS, 1998, 35 (03): 201-206
  • [30] Adaptive control of a class of discrete-time affine nonlinear systems
    Xie, L.L.
    Guo, L.
    Systems and Control Letters, 1998, 35 (03): 201-206