Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm

Cited by: 30
Authors
Song, Shijie [1 ]
Zhu, Minglei [1 ]
Dai, Xiaolin [1 ]
Gong, Dawei [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu 611731, Peoples R China
Funding
Academy of Finland
Keywords
Heuristic algorithms; Q-learning; Nonlinear dynamical systems; Approximation algorithms; Iterative algorithms; Convergence; Artificial neural networks; Adaptive dynamic programming (ADP); neural network (NN); off-policy technique; optimal tracking control (OTC); CONTROL SCHEME; LINEAR-SYSTEMS;
DOI
10.1109/TNNLS.2022.3178746
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, a novel model-free dynamic inversion-based Q-learning (DIQL) algorithm is proposed to solve the optimal tracking control (OTC) problem of unknown nonlinear input-affine discrete-time (DT) systems. Compared with the existing DIQL algorithm and the discount factor-based Q-learning (DFQL) algorithm, the proposed algorithm can eliminate the tracking error while ensuring that it is model-free and off-policy. First, a new deterministic Q-learning iterative scheme is presented, and based on this scheme, a model-based off-policy DIQL algorithm is designed. The advantage of this new scheme is that it can avoid the training of unusual data and improve data utilization, thereby saving computing resources. Simultaneously, the convergence and stability of the designed algorithm are analyzed, and the proof that adding probing noise into the behavior policy does not affect the convergence is presented. Then, by introducing neural networks (NNs), the model-free version of the designed algorithm is further proposed so that the OTC problem can be solved without any knowledge about the system dynamics. Finally, three simulation examples are given to demonstrate the effectiveness of the proposed algorithm.
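The abstract names the ingredients of the approach (a deterministic Q-learning iteration, an off-policy behavior policy with probing noise, and an NN critic for the model-free version) but this record contains no equations, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's DIQL algorithm: a linear plant stands in for the unknown nonlinear input-affine dynamics, quadratic features stand in for the NN critic, a model-based feedforward plays the role of the dynamic-inversion step, and every matrix, gain, and name (A, B, F, L, K, Qe, R) is an illustrative assumption.

```python
import numpy as np

# Illustrative linear plant standing in for the unknown nonlinear input-affine
# dynamics; A and B are used only to simulate data, never handed to the learner.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
# Marginally stable reference generator r_{k+1} = F r_k (an undamped oscillator).
F = np.array([[0.9, 0.1],
              [-1.9, 0.9]])
Qe = np.eye(2)           # tracking-error weight
R = np.array([[1.0]])    # weight on the deviation from the feedforward input

# Dynamic-inversion-style feedforward (model-based here for brevity): a gain L
# with F r = A r + B (L r), so u_d = L r keeps the plant on the reference.
L = np.linalg.lstsq(B, F - A, rcond=None)[0]        # 1 x 2 feedforward gain

ne, nv = 2, 1            # error and input dimensions
nw = ne + nv

def features(w):
    """Independent quadratic monomials so that w^T H w = theta^T phi(w)."""
    phi = [w[i] * w[i] for i in range(nw)]
    phi += [2.0 * w[i] * w[j] for i in range(nw) for j in range(i + 1, nw)]
    return np.array(phi)

def theta_to_H(theta):
    """Rebuild the symmetric Q-function kernel H from the parameter vector."""
    H = np.zeros((nw, nw))
    H[np.diag_indices(nw)] = theta[:nw]
    k = nw
    for i in range(nw):
        for j in range(i + 1, nw):
            H[i, j] = H[j, i] = theta[k]
            k += 1
    return H

# Off-policy data collection: an arbitrary behavior input with probing noise.
rng = np.random.default_rng(0)
N = 400
x = rng.standard_normal(2)
r = np.array([1.0, 0.0])
E, V, C, En = [], [], [], []
for k in range(N):
    e = x - r
    u = 0.3 * np.sin(0.7 * k) + 0.5 * rng.standard_normal(nv)   # probing input
    v = u - L @ r                       # deviation from the feedforward input
    C.append(float(e @ Qe @ e + v @ R @ v))
    E.append(e); V.append(v)
    x = A @ x + B @ u                   # simulate the "unknown" plant
    r = F @ r
    En.append(x - r)
E, V, C, En = map(np.array, (E, V, C, En))
Phi = np.array([features(np.concatenate([e, v])) for e, v in zip(E, V)])

# Iterative deterministic Q-learning (value-iteration style) on the stored data:
# Q_{j+1}(e, v) = c(e, v) + min_{v'} Q_j(e', v'), fitted by least squares.
H = np.zeros((nw, nw))
for _ in range(60):
    Hee, Hev, Hvv = H[:ne, :ne], H[:ne, ne:], H[ne:, ne:]
    if np.allclose(Hvv, 0):
        next_val = np.zeros(N)                        # Q_0 = 0
    else:
        M = Hee - Hev @ np.linalg.solve(Hvv, Hev.T)   # min_v Q_j(e', v)
        next_val = np.einsum('ni,ij,nj->n', En, M, En)
    theta, *_ = np.linalg.lstsq(Phi, C + next_val, rcond=None)
    H = theta_to_H(theta)

K = np.linalg.solve(H[ne:, ne:], H[:ne, ne:].T)       # feedback gain on the error
print("feedforward L =", L, "\nfeedback K =", K)      # tracking law u = L r - K (x - r)
```

Because the learning loop touches only the stored data set, the behavior input (including its probing noise) can differ freely from the greedy target policy, which mirrors the off-policy property the abstract emphasizes; no discount factor is introduced, since the feedforward term removes the persistent part of the control from the cost.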
Pages: 999-1012
Number of pages: 14
Related Papers
50 items in total
  • [1] Optimal tracking control for discrete-time systems by model-free off-policy Q-learning approach
    Li, Jinna
    Yuan, Decheng
    Ding, Zhengtao
    2017 11TH ASIAN CONTROL CONFERENCE (ASCC), 2017: 7-12
  • [2] Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems
    Li, Chun
    Ding, Jinliang
    Lewis, Frank L.
    Chai, Tianyou
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35(3): 3191-3201
  • [3] H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs
    Yang, Yunjie
    Wan, Yan
    Zhu, Jihong
    Lewis, Frank L.
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5(1): 175-180
  • [4] Stochastic linear quadratic optimal control for model-free discrete-time systems based on Q-learning algorithm
    Wang, Tao
    Zhang, Huaguang
    Luo, Yanhong
    NEUROCOMPUTING, 2018, 312: 1-8
  • [5] Model-free optimal tracking control for discrete-time system with delays using reinforcement Q-learning
    Liu, Yang
    Yu, Rui
    ELECTRONICS LETTERS, 2018, 54(12): 750-751
  • [6] Adjustable Iterative Q-Learning Schemes for Model-Free Optimal Tracking Control
    Qiao, Junfei
    Zhao, Mingming
    Wang, Ding
    Ha, Mingming
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54(2): 1202-1213
  • [7] Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    Ding, Zhengtao
    Jiang, Yi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30(5): 1308-1320
  • [8] Policy Optimization Adaptive Dynamic Programming for Optimal Control of Input-Affine Discrete-Time Nonlinear Systems
    Lin, Mingduo
    Zhao, Bo
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53(7): 4339-4350
  • [9] Model-Free Learning Control of Nonlinear Discrete-Time Systems
    Sadegh, Nader
    2011 AMERICAN CONTROL CONFERENCE, 2011: 3553-3558
  • [10] Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI
    Kim, J. -H.
    Lewis, F. L.
    AUTOMATICA, 2010, 46(8): 1320-1326