Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

Cited by: 0
Authors
Cassel, Asaf [1]
Cohen, Alon [2]
Koren, Tomer [1]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, Tel Aviv, Israel
[2] Google Res, Tel Aviv, Israel
Keywords
ADAPTIVE-CONTROL; IDENTIFICATION; PARAMETER;
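DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract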
We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix A is unknown, and when only the state-action transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.
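For orientation, the setting the abstract refers to can be written in the standard linear quadratic form below. This is only a sketch of the usual textbook formulation, not a statement taken from this record: the cost matrices Q and R, the noise term w_t, the optimal average cost J^\star, and the horizon T are assumed notation, and the non-degeneracy condition is not specified here.

```latex
% Standard LQR setup (assumed notation; only A and B are named in the abstract).
\begin{align*}
  x_{t+1} &= A x_t + B u_t + w_t
    && \text{(state transition with noise } w_t\text{)} \\
  c_t &= x_t^{\top} Q x_t + u_t^{\top} R u_t
    && \text{(quadratic cost, } Q \succeq 0,\ R \succ 0\text{)} \\
  \mathrm{Regret}_T &= \sum_{t=1}^{T} \bigl( c_t - J^{\star} \bigr)
    && \text{(excess cost over the optimal average cost } J^{\star}\text{)}
\end{align*}
% The earlier results mentioned in the abstract correspond to regret growing
% like \sqrt{T}; the abstract's new bounds are polylogarithmic in T.
```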
Pages: 10
Related papers
50 in total
  • [1] Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
    Cassel, Asaf
    Cohen, Alon
    Koren, Tomer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [2] Learning Linear-Quadratic Regulators Efficiently with only √T Regret
    Cohen, Alon
    Koren, Tomer
    Mansour, Yishay
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [3] Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret
    Cassel, Asaf
    Koren, Tomer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [4] Logarithmic regret in online linear quadratic control using Riccati updates
    Mohammad Akbari
    Bahman Gharesifard
    Tamas Linder
    Mathematics of Control, Signals, and Systems, 2022, 34 : 647 - 678
  • [5] Logarithmic regret in online linear quadratic control using Riccati updates
    Akbari, Mohammad
    Gharesifard, Bahman
    Linder, Tamas
    MATHEMATICS OF CONTROL SIGNALS AND SYSTEMS, 2022, 34 (03) : 647 - 678
  • [6] Logarithmic Regret for Reinforcement Learning with Linear Function Approximation
    He, Jiafan
    Zhou, Dongruo
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [7] Regret Lower Bounds for Unbiased Adaptive Control of Linear Quadratic Regulators
    Ziemann, Ingvar
    Sandberg, Henrik
    IEEE CONTROL SYSTEMS LETTERS, 2020, 4 (03) : 785 - 790
  • [8] Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon
    Basei, Matteo
    Guo, Xin
    Hu, Anran
    Zhang, Yufei
    Journal of Machine Learning Research, 2022, 23
  • [9] Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon
    Basei, Matteo
    Guo, Xin
    Hu, Anran
    Zhang, Yufei
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [10] The Fundamental Limitations of Learning Linear-Quadratic Regulators
    Lee, Bruce D.
    Ziemann, Ingvar
    Tsiamis, Anastasios
    Sandberg, Henrik
    Matni, Nikolai
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023 : 4053 - 4060