Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

Cited by: 0
Authors
Cassel, Asaf [1]
Cohen, Alon [2]
Koren, Tomer [1]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, Tel Aviv, Israel
[2] Google Res, Tel Aviv, Israel
Keywords
ADAPTIVE-CONTROL; IDENTIFICATION; PARAMETER;
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix A is unknown, and when only the state-action transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.
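The first scenario in the abstract (only the state transition matrix A is unknown) can be illustrated with a minimal certainty-equivalence sketch. This is not the algorithm of the paper, only a generic illustration under stated assumptions: dynamics of the form x_{t+1} = A x_t + B u_t + w_t with Gaussian noise, quadratic cost matrices Q and R, and a known B. The function names `riccati_gain`, `estimate_A`, and `simulate` are hypothetical helpers introduced here.

```python
import numpy as np


def riccati_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion; return K for the policy u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def simulate(A, B, K, T, rng, noise_std=1.0):
    """Roll out x_{t+1} = A x_t + B u_t + w_t under u_t = -K x_t (assumed noise model)."""
    d, m = B.shape
    xs = np.zeros((T + 1, d))
    us = np.zeros((T, m))
    for t in range(T):
        us[t] = -K @ xs[t]
        xs[t + 1] = A @ xs[t] + B @ us[t] + noise_std * rng.standard_normal(d)
    return xs, us


def estimate_A(xs, us, B):
    """Least-squares estimate of A from one trajectory when B is known."""
    X = xs[:-1]                 # x_0 .. x_{T-1}
    Y = xs[1:] - us @ B.T       # x_{t+1} - B u_t = A x_t + w_t
    A_T, *_ = np.linalg.lstsq(X, Y, rcond=None)  # solves X @ A.T ~= Y
    return A_T.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[0.9, 0.2], [0.0, 0.8]])   # illustrative values, not from the paper
    B, Q, R = np.eye(2), np.eye(2), np.eye(2)
    xs, us = simulate(A, B, np.zeros((2, 2)), 2000, rng)
    A_hat = estimate_A(xs, us, B)
    K_hat = riccati_gain(A_hat, B, Q, R)
    print("estimation error:", np.linalg.norm(A_hat - A))
    print("controller from estimate:\n", K_hat)
```

The process noise alone excites the state here, so A can be estimated without injecting extra exploration into the controls; whether this intuition matches the paper's analysis is not stated in the abstract and is an assumption of this sketch.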
Pages: 10
Related papers
50 in total
  • [21] Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems
    Lale, Sahin
    Azizzadenesheli, Kamyar
    Hassibi, Babak
    Anandkumar, Anima
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [22] Low-complexity learning of Linear Quadratic Regulators from noisy data
    De Persis, Claudio
    Tesi, Pietro
    AUTOMATICA, 2021, 128
  • [23] Regret bounds for online-learning-based linear quadratic control under database attacks
    Chekan, Jafar Abbaszadeh
    Langbort, Cedric
    AUTOMATICA, 2023, 151
  • [24] Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
    Anandkumar, Animashree
    Michael, Nithin
    Tang, Kevin
    Swami, Ananthram
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2011, 29 (04) : 731 - 745
  • [25] No-Regret Learning with Unbounded Losses: The Case of Logarithmic Pooling
    Neyman, Eric
    Roughgarden, Tim
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [27] A CLASS OF LINEAR-QUADRATIC DISCRETE REGULATORS
    MAHMOUD, MS
    BAHNASAWI, AA
    JOURNAL OF THE UNIVERSITY OF KUWAIT-SCIENCE, 1993, 20 (02) : 227 - 235
  • [28] Time-changed Linear Quadratic Regulators
    Lamperski, Andrew
    Cowan, Noah J.
    2013 EUROPEAN CONTROL CONFERENCE (ECC), 2013 : 198 - 203
  • [29] The composition synthesis of linear-quadratic regulators
    Dubovik, S.A.
    Problemy Upravleniya I Informatiki (Avtomatika), 1999, (02) : 50 - 62
  • [30] The composition synthesis of linear-quadratic regulators
    Dubovik, Sergey A.
    Journal of Automation and Information Sciences, 1999, 31 (7-9) : 33 - 42