Data-Driven $H_{\infty}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Cited: 18
Authors
Zhang, Li [1 ,2 ]
Fan, Jialu [1 ,2 ]
Xue, Wenqian [1 ,2 ]
Lopez, Victor G. [3 ]
Li, Jinna [4 ]
Chai, Tianyou [1 ,2 ]
Lewis, Frank L. [5 ]
Affiliations
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Int Joint Res Lab Integrated Automat, Shenyang 110819, Peoples R China
[3] Leibniz Univ Hannover, D-30167 Hannover, Germany
[4] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
[5] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
Keywords
Heuristic algorithms; Optimal control; Transmission line matrix methods; Process control; Performance analysis; Output feedback; Games; H∞ control; off-policy Q-learning; Q-learning; static output feedback (OPFB); zero-sum game; H-INFINITY CONTROL; ZERO-SUM GAMES; ADAPTIVE OPTIMAL-CONTROL; ALGORITHM; DESIGN;
DOI
10.1109/TNNLS.2021.3112457
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article develops two novel output feedback (OPFB) Q-learning algorithms, an on-policy and an off-policy Q-learning algorithm, to solve the $H_{\infty}$ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution is a newly developed OPFB control design for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the premise that the disturbance attenuation condition is satisfied. The convergence of the proposed Q-learning methods, as well as the differences between and the equivalence of the two algorithms, are rigorously proven. Moreover, although probing noise must be injected to maintain the persistence of excitation (PE) condition, the proposed off-policy Q-learning method is immune to this noise and thus avoids biased solutions. Simulation results verify the effectiveness of the proposed approaches.
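To make the learning scheme concrete, below is a minimal Python sketch of model-free Q-learning for the zero-sum game underlying an $H_{\infty}$ design, in the state-feedback special case (output matrix C = I, so the static OPFB law reduces to state feedback). The system matrices, stage weights, and attenuation level gamma are hypothetical and not taken from the paper, and the iteration follows classical value-iteration Q-learning for zero-sum games (in the style of Al-Tamimi et al.) rather than the article's exact algorithms. It illustrates the off-policy advantage the abstract highlights: the exploratory inputs u_k, w_k enter the Bellman regression as the actual actions inside z_k = [x_k; u_k; w_k], so probing noise does not bias the least-squares estimate of the Q-function kernel H.

```python
import numpy as np

# --- Hypothetical example system (NOT from the paper): x' = A x + B u + E w ---
A = np.array([[0.8, 0.2],
              [0.0, 0.7]])            # stable open loop
B = np.array([[0.0], [1.0]])          # control input matrix
E = np.array([[0.1], [0.0]])          # disturbance input matrix
n, m, q = 2, 1, 1
Qx, R, gamma = np.eye(n), np.eye(m), 2.0   # stage weights, attenuation level
nz = n + m + q                             # z = [x; u; w]

def quad_features(z):
    """phi(z) such that phi(z) @ h == z @ H @ z, h = upper triangle of symmetric H."""
    iu = np.triu_indices(nz)
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)   # off-diagonal entries count twice
    return scale * np.outer(z, z)[iu]

def unvec(h):
    """Rebuild the symmetric kernel H from its upper-triangular parameter vector."""
    Hm = np.zeros((nz, nz))
    Hm[np.triu_indices(nz)] = h
    return Hm + Hm.T - np.diag(np.diag(Hm))

rng = np.random.default_rng(0)
P = np.zeros((n, n))                  # value kernel: V_j(x) = x @ P @ x, V_0 = 0

for j in range(50):                   # value-iteration loop on the game Q-function
    Phi, b = [], []
    x = rng.normal(size=n)
    for k in range(150):              # behavior policy: pure exploration (off-policy)
        u = rng.normal(size=m)
        w = rng.normal(size=q)
        xn = A @ x + B @ u + E @ w
        r = x @ Qx @ x + u @ R @ u - gamma**2 * (w @ w)   # zero-sum stage cost
        Phi.append(quad_features(np.concatenate([x, u, w])))
        b.append(r + xn @ P @ xn)     # Bellman target: r + V_j(x_{k+1})
        x = xn
    h, *_ = np.linalg.lstsq(np.asarray(Phi), np.asarray(b), rcond=None)
    H = unvec(h)                      # Q_{j+1}(x, u, w) = [x;u;w] @ H @ [x;u;w]
    # V_{j+1}(x) = saddle value of Q_{j+1}: Schur complement of the (u, w) block
    P = H[:n, :n] - H[:n, n:] @ np.linalg.solve(H[n:, n:], H[n:, :n])

# Saddle-point gains from the converged kernel: [u; w] = -inv(H_uw) @ H_ux @ x
G = np.linalg.solve(H[n:, n:], H[n:, :n])
K, L = G[:m], -G[m:]
print("control gain K (u = -K x):", K)
print("worst-case disturbance gain L (w = L x):", L)
```

In the article's static OPFB setting, the regressor would instead be built from measured outputs, and the stated existence conditions must hold for the learned gain to be realizable as a static output feedback law.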
Pages: 3553-3567
Page count: 15
Related papers
50 records (first 10 listed)
  • [1] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
    Zhang, Kun
    Liu, Xuantong
    Zhang, Lei
    Chen, Qian
    Peng, Yunjian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2350 - 2356
  • [2] Output Feedback H∞ Control of Unknown Discrete-time Linear Systems: Off-policy Reinforcement Learning
    Tooranjipour, Pouria
    Kiumarsi, Bahare
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 2264 - 2269
  • [3] Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    Ding, Zhengtao
    Jiang, Yi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1308 - 1320
  • [4] Output Feedback H∞ Control for Linear Discrete-Time Multi-Player Systems With Multi-Source Disturbances Using Off-Policy Q-Learning
    Xiao, Zhenfei
    Li, Jinna
    Li, Ping
    IEEE ACCESS, 2020, 8 : 208938 - 208951
  • [5] Data-Driven Robust Control of Discrete-Time Uncertain Linear Systems via Off-Policy Reinforcement Learning
    Yang, Yongliang
    Guo, Zhishan
    Xiong, Haoyi
    Ding, Da-Wei
    Yin, Yixin
    Wunsch, Donald C.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (12) : 3735 - 3747
  • [6] H∞ Control for Discrete-time Linear Systems by Integrating Off-policy Q-learning and Zero-sum Game
    Li, Jinna
    Ding, Zhengtao
    Yang, Chunyu
    Niu, Hong
    2018 IEEE 14TH INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2018, : 817 - 822
  • [7] H∞ Control for Discrete-Time Multi-Player Systems via Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
IEEE ACCESS, 2020, 8 (08): 28831 - 28846
  • [8] H∞ control of linear discrete-time systems: Off-policy reinforcement learning
    Kiumarsi, Bahare
    Lewis, Frank L.
    Jiang, Zhong-Ping
    AUTOMATICA, 2017, 78 : 144 - 152
  • [9] Seeking Nash Equilibrium for Linear Discrete-time Systems via Off-policy Q-learning
    Ni, Haohan
    Ji, Yuxiang
    Yang, Yuxiao
    Zhou, Jianping
IAENG INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS, 2024, 54 (11): 2477 - 2483
  • [10] Optimal tracking control for discrete-time systems by model-free off-policy Q-learning approach
    Li, Jinna
    Yuan, Decheng
    Ding, Zhengtao
    2017 11TH ASIAN CONTROL CONFERENCE (ASCC), 2017, : 7 - 12