Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems

Cited by: 73

Authors:

Palanisamy, Muthukumar ^{[1,2]}

Modares, Hamidreza ^{[2]}

Lewis, Frank L. ^{[2]}

Aurangzeb, Muhammad ^{[2]}

Affiliations:

[1] Gandhigram Rural Inst Deemed Univ, Dept Math, Gandhigram 624302, India

[2] Univ Texas Arlington Res Inst, Ft Worth, TX 76118 USA

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2015 / Vol. 45 / No. 02

Funding:

US National Science Foundation;

Keywords:

Approximate dynamic programming (ADP); continuous-time dynamical systems; infinite-horizon discounted cost function; integral reinforcement learning (IRL); optimal control; Q-learning; value iteration (VI); ADAPTIVE OPTIMAL-CONTROL; ITERATION; SYSTEMS;

DOI:

10.1109/TCYB.2014.2322116

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This paper presents a Q-learning method to solve the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most existing methods for solving the LQR problem for CT systems need partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but has generally been well understood only for discrete-time systems. The contribution of this paper is a Q-learning methodology for CT systems that solves the LQR problem without any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. Simulation results supporting the theoretical development are also presented.
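
For orientation, the discounted CT LQR problem named in the abstract can be illustrated with a minimal model-based sketch. This is not the paper's Q-learning algorithm, which needs no model. It assumes a scalar system dx/dt = a*x + b*u with cost integral of exp(-gamma*t) * (q*x^2 + r*u^2), and the symbols a, b, q, r, gamma and the function name are illustrative choices, not taken from the record. Under these assumptions the value function is p*x^2, where p is the positive root of the discounted Riccati equation 2*a*p - gamma*p + q - (b^2/r)*p^2 = 0, and the optimal feedback is u = -(b*p/r)*x. A model-free method would be expected to recover the same gain.

```python
import math


def discounted_lqr_scalar(a, b, q, r, gamma):
    """Model-based baseline for the scalar discounted LQR problem (illustrative).

    Solves 2*a*p - gamma*p + q - (b**2 / r) * p**2 = 0 for the positive root p
    and returns (p, k), where the feedback law is u = -k * x.
    All argument names are assumptions made for this sketch.
    """
    if b == 0 or r <= 0 or q <= 0 or gamma < 0:
        raise ValueError("need b != 0, r > 0, q > 0, gamma >= 0")
    s = b * b / r                # control authority term b^2 / r
    a_eff = a - gamma / 2.0      # discounting acts like shifting a by gamma/2
    # Positive root of s*p^2 - 2*a_eff*p - q = 0
    p = (a_eff + math.sqrt(a_eff * a_eff + s * q)) / s
    k = b * p / r
    return p, k


if __name__ == "__main__":
    p, k = discounted_lqr_scalar(a=1.0, b=1.0, q=1.0, r=1.0, gamma=0.5)
    residual = 2 * 1.0 * p - 0.5 * p + 1.0 - p * p
    print(p, k, residual)
```

With gamma = 0 the equation reduces to the standard undiscounted scalar Riccati equation.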


Pages: 165 - 176

Page count: 12

50 items in total

[21] Constrained infinite-horizon linear quadratic regulation discrete-time systems
Lee, Ji-Woong
Khargonekar, Pramod P.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2007, 52 (10) : 1951 - 1958
[22] INFINITE HORIZON LINEAR-QUADRATIC REGULATOR PROBLEMS FOR BEAMS AND PLATES
LAGNESE, JE
LECTURE NOTES IN CONTROL AND INFORMATION SCIENCES, 1989, 114 : 177 - 189
[23] Direct data-driven discounted infinite horizon linear quadratic regulator with robustness guarantees
Esmzad, Ramin
Modares, Hamidreza
AUTOMATICA, 2025, 175
[24] Linear quadratic optimal control for a class of continuous-time nonhomogeneous Markovian jump linear systems in infinite time horizon
Bai, Yuzhu
Sun, Hui-Jie
Wu, Ai-Guo
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2020, 357 (14) : 9733 - 9760
[25] Linear quadratic optimal control for a class of continuous-time nonhomogeneous Markovian jump linear systems in infinite time horizon
Bai, Yuzhu
Sun, Hui-Jie
Wu, Ai-Guo
Journal of the Franklin Institute, 2020, 357 (14) : 9733 - 9760
[26] Robust Inverse Q-Learning for Continuous-Time Linear Systems in Adversarial Environments
Lian, Bosen
Xue, Wenqian
Lewis, Frank L.
Chai, Tianyou
IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (12) : 13083 - 13095
[27] Output Feedback Q-Learning for Linear-Quadratic Discrete-Time Finite-Horizon Control Problems
Calafiore, Giuseppe C.
Possieri, Corrado
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (07) : 3274 - 3281
[28] Optimal control for continuous-time linear quadratic problems with infinite Markov jump parameters
Fragoso, MD
Baczynski, J
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2001, 40 (01) : 270 - 297
[29] Stabilization of a Wheeled Inverted Pendulum by a Continuous-Time Infinite-Horizon LQG Optimal Controller
Lupian, Luis F.
Avila, Rodrigo
2008 5TH LATIN AMERICAN ROBOTICS SYMPOSIUM (LARS 2008), 2008, : 58 - 62
[30] Continuous-time linear-quadratic regulator with output feedback
Gessing, R
PROCEEDINGS OF THE 2000 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2000, : 877 - 881
