Data-Driven $H_{\infty}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Cited by: 18
Authors
Zhang, Li [1 ,2 ]
Fan, Jialu [1 ,2 ]
Xue, Wenqian [1 ,2 ]
Lopez, Victor G. [3 ]
Li, Jinna [4 ]
Chai, Tianyou [1 ,2 ]
Lewis, Frank L. [5 ]
Affiliations
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Int Joint Res Lab Integrated Automat, Shenyang 110819, Peoples R China
[3] Leibniz Univ Hannover, D-30167 Hannover, Germany
[4] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
[5] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
Keywords
Heuristic algorithms; Optimal control; Transmission line matrix methods; Process control; Performance analysis; Output feedback; Games; H∞ control; off-policy Q-learning; Q-learning; static output feedback (OPFB); zero-sum game; H-INFINITY CONTROL; ZERO-SUM GAMES; ADAPTIVE OPTIMAL-CONTROL; ALGORITHM; DESIGN
DOI
10.1109/TNNLS.2021.3112457
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article develops two novel output feedback (OPFB) Q-learning algorithms, an on-policy Q-learning algorithm and an off-policy Q-learning algorithm, to solve the $H_{\infty}$ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control formulation for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the premise that the disturbance attenuation condition is satisfied. The convergence of the proposed Q-learning methods, as well as the difference and equivalence between the two algorithms, is rigorously proven. Moreover, regarding the probing noise that must be added to guarantee the persistence of excitation (PE), the off-policy Q-learning method has the advantage of being immune to this noise, thereby avoiding bias in the learned solution. Simulation results verify the effectiveness of the proposed approaches.
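To make the mechanism in the abstract concrete, the sketch below implements a generic off-policy Q-learning policy iteration for the linear-quadratic zero-sum game that underlies $H_{\infty}$ control. It is a minimal illustration, not the authors' algorithm: it uses full state feedback rather than the paper's static OPFB structure, and the system matrices (A, B, E), weights, noise scales, and helper functions (feats, unvech) are all hypothetical choices made here for the example. The true model is used only to simulate data; the learning update itself is data-driven.

```python
# Hedged sketch: off-policy Q-learning policy iteration for a linear-quadratic
# zero-sum game (H-infinity control). State feedback is used for brevity; the
# paper's static OPFB setting is more restrictive. All matrices below are
# illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system x_{k+1} = A x + B u + E w (used ONLY to generate data).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[1.0], [0.0]])
n, m, q = 2, 1, 1
Qx, R, gamma = np.eye(n), np.eye(m), 5.0   # utility: x'Qx + u'Ru - gamma^2 w'w

def feats(z):
    """Quadratic features so that feats(z) @ theta == z' H z for symmetric H."""
    p = len(z)
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(p) for j in range(i, p)])

def unvech(theta, p):
    """Rebuild the symmetric Q-function kernel H from its free parameters."""
    H, k = np.zeros((p, p)), 0
    for i in range(p):
        for j in range(i, p):
            H[i, j] = H[j, i] = theta[k]
            k += 1
    return H

# --- Collect one batch of data under an exploratory behavior policy. ---
T, data = 400, []
x = rng.standard_normal(n)
for _ in range(T):
    u = 0.5 * rng.standard_normal(m)          # probing input (behavior policy)
    w = 0.3 * rng.standard_normal(q)          # simulated disturbance
    x_next = A @ x + B @ u + E @ w
    r = x @ Qx @ x + u @ R @ u - gamma**2 * (w @ w)
    data.append((x, u, w, r, x_next))
    x = x_next

# --- Off-policy policy iteration on the Q-function kernel H. ---
K = np.zeros((m, n))        # A is stable here, so K = 0, L = 0 is admissible
L = np.zeros((q, n))
p = n + m + q
for it in range(20):
    # Policy evaluation: z'Hz - zbar'Hzbar = r, where zbar is built from the
    # TARGET policy at x_{k+1}. The applied (u, w) enter z directly, which is
    # why probing noise does not bias the least-squares solution.
    Phi = np.array([feats(np.concatenate([x, u, w]))
                    - feats(np.concatenate([xn, -K @ xn, -L @ xn]))
                    for (x, u, w, r, xn) in data])
    b = np.array([r for (_, _, _, r, _) in data])
    theta, *_ = np.linalg.lstsq(Phi, b, rcond=None)
    H = unvech(theta, p)

    # Policy improvement: saddle point of z'Hz over (u, w).
    M = H[n:, n:]                             # [[Huu, Huw], [Hwu, Hww]]
    G = np.linalg.solve(M, H[n:, :n])         # stacked gains [K; L]
    K_new, L_new = G[:m, :], G[m:, :]
    if np.linalg.norm(K_new - K) + np.linalg.norm(L_new - L) < 1e-8:
        break
    K, L = K_new, L_new

print("iterations:", it + 1)
print("control gain K:", K)
print("worst-case disturbance gain L:", L)
```

The decoupling of behavior and target policies in the zbar construction is the essence of the off-policy advantage the abstract claims: the probing noise appears only in the applied (u, w), which the Q-function takes as free arguments, so it never contaminates the Bellman residual. In the paper's setting, the measured output would replace the full state in the feedback structure, subject to the existence conditions the authors derive.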
Pages: 3553-3567 (15 pages)