Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems

Cited by: 73
Authors
Palanisamy, Muthukumar [1 ,2 ]
Modares, Hamidreza [2 ]
Lewis, Frank L. [2 ]
Aurangzeb, Muhammad [2 ]
Affiliations
[1] Gandhigram Rural Inst Deemed Univ, Dept Math, Gandhigram 624302, India
[2] Univ Texas Arlington Res Inst, Ft Worth, TX 76118 USA
Funding
U.S. National Science Foundation;
Keywords
Approximate dynamic programming (ADP); continuous-time dynamical systems; infinite-horizon discounted cost function; integral reinforcement learning (IRL); optimal control; Q-learning; value iteration (VI); ADAPTIVE OPTIMAL-CONTROL; ITERATION; SYSTEMS;
DOI
10.1109/TCYB.2014.2322116
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper presents a Q-learning method for solving the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most methods in the existing literature for solving the LQR problem for CT systems require partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but it has generally been well understood only for discrete-time systems. The contribution of this paper is a Q-learning methodology for CT systems that solves the LQR problem without any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. Simulation results supporting the theoretical development are also presented.
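For orientation, the following is a minimal LaTeX sketch of the standard discounted CT LQR setup the abstract refers to, written with generic symbols (state x, input u, cost weights Q_c and R, discount rate gamma) that are not taken from the paper itself:

\dot{x}(t) = A x(t) + B u(t), \qquad V(x(t)) = \int_{t}^{\infty} e^{-\gamma(\tau - t)} \big( x(\tau)^{\top} Q_c\, x(\tau) + u(\tau)^{\top} R\, u(\tau) \big)\, d\tau .

In Q-learning treatments of this problem, the value function is extended to a quadratic Q-function whose kernel matrix H is estimated from measured data, and the optimal feedback is recovered by minimizing over the input:

Q(x, u) = \frac{1}{2} \begin{bmatrix} x \\ u \end{bmatrix}^{\top} \begin{bmatrix} H_{xx} & H_{xu} \\ H_{ux} & H_{uu} \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix}, \qquad u^{*}(x) = \arg\min_{u} Q(x, u) = -H_{uu}^{-1} H_{ux}\, x ,

so the optimal gain is obtained without using A or B. The paper's specific CT parameterization, which additionally involves derivatives of the control input, is developed in the text itself and is not reproduced here.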
Pages: 165-176
Number of pages: 12
Related Papers
50 records
  • [1] Solution for the Continuous-Time Infinite-Horizon Linear Quadratic Regulator Subject to Scalar State Constraints
    van Keulen, Thijs
    IEEE CONTROL SYSTEMS LETTERS, 2020, 4(1): 133-138
  • [2] Constrained Infinite-horizon Linear Quadratic Regulation of Continuous-time Systems
    Gao Xiang-Yu
    Zhang Xian
    2011 30TH CHINESE CONTROL CONFERENCE (CCC), 2011: 1893-1898
  • [3] Infinite-horizon Risk-constrained Linear Quadratic Regulator with Average Cost
    Zhao, Feiran
    You, Keyou
    Basar, Tamer
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021: 390-395
  • [4] Infinite-horizon continuous-time growth models
    Kravvaritis, D
    Papageorgiou, NS
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1996, 27(4): 373-378
  • [5] Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
    Vamvoudakis, Kyriakos G.
    SYSTEMS & CONTROL LETTERS, 2017, 100: 14-20
  • [6] On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs
    Chen, Yuanzhou
    He, Jiafan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [7] Infinite-Horizon Continuous-Time NMPC via Time Transformation
    Wuerth, Lynn
    Marquardt, Wolfgang
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2014, 59(9): 2543-2548
  • [8] Addressing infinite-horizon optimization in MPC via Q-learning
    Beckenbach, Lukas
    Osinenko, Pavel
    Streif, Stefan
    IFAC PAPERSONLINE, 2018, 51(20): 60-65
  • [9] Infinite-horizon linear-quadratic regulator problems for nonautonomous parabolic systems with boundary control
    Acquistapace, P
    Terreni, B
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1996, 34(1): 1-30
  • [10] Safe Q-learning for continuous-time linear systems
    Bandyopadhyay, Soutrik
    Bhasin, Shubhendu
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2023: 241-246