Data-Driven $H_{\infty}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Cited by: 18
Authors:
Zhang, Li [1 ,2 ]
Fan, Jialu [1 ,2 ]
Xue, Wenqian [1 ,2 ]
Lopez, Victor G. [3 ]
Li, Jinna [4 ]
Chai, Tianyou [1 ,2 ]
Lewis, Frank L. [5 ]
Affiliations:
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Int Joint Res Lab Integrated Automat, Shenyang 110819, Peoples R China
[3] Leibniz Univ Hannover, D-30167 Hannover, Germany
[4] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
[5] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
Keywords:
Heuristic algorithms; Optimal control; Transmission line matrix methods; Process control; Performance analysis; Output feedback; Games; H∞ control; off-policy Q-learning; Q-learning; static output feedback (OPFB); zero-sum game; H-INFINITY CONTROL; ZERO-SUM GAMES; ADAPTIVE OPTIMAL-CONTROL; ALGORITHM; DESIGN;
DOI:
10.1109/TNNLS.2021.3112457
Chinese Library Classification (CLC):
TP18 [Artificial Intelligence Theory]
Discipline codes:
081104; 0812; 0835; 1405
Abstract
This article develops two novel output feedback (OPFB) Q-learning algorithms, an on-policy and an off-policy Q-learning algorithm, to solve the $H_{\infty}$ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm formulation for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the disturbance attenuation condition. The convergence of the proposed Q-learning methods is rigorously proven, as are the differences and the equivalence of the two algorithms. Moreover, whereas probing noise must be injected to maintain the persistence of excitation (PE), the proposed off-policy Q-learning method is immune to this noise and thus avoids a biased solution. Simulation results are presented to verify the effectiveness of the proposed approaches.
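To make the zero-sum game formulation behind the abstract concrete, the following is a minimal sketch of the quadratic Q-function setup that this type of $H_{\infty}$ Q-learning builds on. The notation ($A$, $B$, $D$, the kernel matrix $H$ and its blocks) is the standard one from the zero-sum game Q-learning literature and is assumed here; it need not match the paper's own symbols. For a linear DT system $x_{k+1} = A x_k + B u_k + D w_k$ with measured output $y_k = C x_k$, control $u_k$, and disturbance $w_k$, the game utility at attenuation level $\gamma$ is
\[
r_k = y_k^{\top} Q y_k + u_k^{\top} R u_k - \gamma^{2} w_k^{\top} w_k ,
\]
and the Q-function is quadratic in the augmented vector $z_k = [\,x_k^{\top} \; u_k^{\top} \; w_k^{\top}\,]^{\top}$:
\[
Q(x_k,u_k,w_k) = z_k^{\top} H z_k ,
\qquad
z_k^{\top} H z_k = r_k + z_{k+1}^{\top} H z_{k+1} ,
\]
where the right-hand side is evaluated with $u_{k+1}$ and $w_{k+1}$ generated by the current policies. Setting $\partial Q/\partial u_k = 0$ (minimizing player) and $\partial Q/\partial w_k = 0$ (maximizing player) yields the saddle-point policies from the blocks of $H$, for instance
\[
u_k = -\bigl(H_{uu} - H_{uw} H_{ww}^{-1} H_{wu}\bigr)^{-1}
\bigl(H_{ux} - H_{uw} H_{ww}^{-1} H_{wx}\bigr) x_k .
\]
In an off-policy scheme, $H$ is identified by least squares from data collected under a fixed behavior policy while the Bellman equation above is written for the target policies; the probing noise injected for PE therefore never enters the evaluated policy, which is why the off-policy estimate of $H$ is unbiased. Restricting the control to the form $u_k = -F y_k$ then yields the static OPFB gain.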
Pages: 3553-3567
Page count: 15
Related papers (50 in total):
  • [21] The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach
    Dong, Xunde
    Lin, Yuxin
    Suo, Xudong
    Wang, Xihao
    Sun, Weijie
    MATHEMATICS, 2024, 12 (04)
  • [22] Output Feedback Reinforcement Q-learning for Optimal Quadratic Tracking Control of Unknown Discrete-Time Linear Systems and Its Application
    Zhao, Guangyue
    Sun, Weijie
    Cai, He
    Peng, Yunjian
    2018 15TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2018, : 750 - 755
  • [23] Off-policy Reinforcement Learning for Robust Control of Discrete-time Uncertain Linear Systems
    Yang, Yongliang
    Guo, Zhishan
    Wunsch, Donald
    Yin, Yixin
    PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 2507 - 2512
  • [24] Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method
    Wang, Yun
    Fang, Tian
    Kong, Qingkai
    Li, Feng
    APPLIED MATHEMATICS AND COMPUTATION, 2024, 467
  • [25] Data-driven Optimal Preview Output Tracking of Linear Discrete-time Systems
    Liu, Zhou-Yang
    Wu, Huai-Ning
    PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 1973 - 1978
  • [26] Data-Driven Nonzero-Sum Game for Discrete-Time Systems Using Off-Policy Reinforcement Learning
    Yang, Yongliang
    Zhang, Sen
    Dong, Jie
    Yin, Yixin
    IEEE ACCESS, 2020, 8 : 14074 - 14088
  • [27] Off-Policy Reinforcement Learning for Optimal Preview Tracking Control of Linear Discrete-Time systems with unknown dynamics
    Wang, Chao-Ran
    Wu, Huai-Ning
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 1402 - 1407
  • [28] Output-feedback Q-learning for discrete-time linear H∞ tracking control: A Stackelberg game approach
    Ren, Yunxiao
    Wang, Qishao
    Duan, Zhisheng
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (12) : 6805 - 6828
  • [29] Minimax Q-learning design for H∞ control of linear discrete-time systems
    Li, Xinxing
    Xi, Lele
    Zha, Wenzhong
    Peng, Zhihong
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2022, 23 (03) : 438 - 451
  • [30] Output Feedback Reinforcement Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017