Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm

Cited by: 30
Authors
Song, Shijie [1]
Zhu, Minglei [1]
Dai, Xiaolin [1]
Gong, Dawei [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu 611731, Peoples R China
Funding
Academy of Finland;
Keywords
Heuristic algorithms; Q-learning; Nonlinear dynamical systems; Approximation algorithms; Iterative algorithms; Convergence; Artificial neural networks; Adaptive dynamic programming (ADP); neural network (NN); off-policy technique; optimal tracking control (OTC); CONTROL SCHEME; LINEAR-SYSTEMS;
DOI
10.1109/TNNLS.2022.3178746
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, a novel model-free dynamic inversion-based Q-learning (DIQL) algorithm is proposed to solve the optimal tracking control (OTC) problem of unknown nonlinear input-affine discrete-time (DT) systems. Compared with the existing DIQL algorithm and the discount factor-based Q-learning (DFQL) algorithm, the proposed algorithm can eliminate the tracking error while ensuring that it is model-free and off-policy. First, a new deterministic Q-learning iterative scheme is presented, and based on this scheme, a model-based off-policy DIQL algorithm is designed. The advantage of this new scheme is that it can avoid the training of unusual data and improve data utilization, thereby saving computing resources. Simultaneously, the convergence and stability of the designed algorithm are analyzed, and the proof that adding probing noise into the behavior policy does not affect the convergence is presented. Then, by introducing neural networks (NNs), the model-free version of the designed algorithm is further proposed so that the OTC problem can be solved without any knowledge about the system dynamics. Finally, three simulation examples are given to demonstrate the effectiveness of the proposed algorithm.
Pages: 999 - 1012
Page count: 14
Related papers
50 in total
  • [21] Optimal Iterative Learning Control for Nonlinear Discrete-time Systems
    Xu Hong-wei
    KEY ENGINEERING MATERIALS AND COMPUTER SCIENCE, 2011, 320 : 605 - 609
  • [22] Discrete-Time Optimal Control Scheme Based on Q-Learning Algorithm
    Wei, Qinglai
    Liu, Derong
    Song, Ruizhuo
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016, : 125 - 130
  • [23] Iterative Q-Learning for Model-Free Optimal Control With Adjustable Convergence Rate
    Wang, Ding
    Wang, Yuan
    Zhao, Mingming
    Qiao, Junfei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (04) : 2224 - 2228
  • [24] Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics
    Kiumarsi, Bahare
    Lewis, Frank L.
    Modares, Hamidreza
    Karimpour, Ali
    Naghibi-Sistani, Mohammad-Bagher
    AUTOMATICA, 2014, 50 (04) : 1167 - 1175
  • [25] Reinforcement Q-learning algorithm for H∞ tracking control of discrete-time Markov jump systems
    Shi, Jiahui
    He, Dakuo
    Zhang, Qiang
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2024,
  • [26] Reinforcement Q-Learning Algorithm for H∞ Tracking Control of Unknown Discrete-Time Linear Systems
    Peng, Yunjian
    Chen, Qian
    Sun, Weijie
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (11) : 4109 - 4122
  • [27] A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
    Wei QingLai
    Liu DeRong
    SCIENCE CHINA-INFORMATION SCIENCES, 2015, 58 (12) : 1 - 15
  • [28] A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
    WEI QingLai
    LIU DeRong
    Science China(Information Sciences), 2015, 58 (12) : 147 - 161
  • [29] An ADDHP-based Q-learning algorithm for optimal tracking control of linear discrete-time systems with unknown dynamics
    Mu, Chaoxu
    Zhao, Qian
    Sun, Changyin
    Gao, Zhongke
    APPLIED SOFT COMPUTING, 2019, 82
  • [30] Optimal Control of Affine Nonlinear Discrete-time Systems
    Dierks, Travis
    Jagannathan, S.
    MED: 2009 17TH MEDITERRANEAN CONFERENCE ON CONTROL & AUTOMATION, VOLS 1-3, 2009, : 1390 - 1395