Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems

Cited by: 73
Authors
Palanisamy, Muthukumar [1 ,2 ]
Modares, Hamidreza [2 ]
Lewis, Frank L. [2 ]
Aurangzeb, Muhammad [2 ]
Affiliations
[1] Gandhigram Rural Inst Deemed Univ, Dept Math, Gandhigram 624302, India
[2] Univ Texas Arlington Res Inst, Ft Worth, TX 76118 USA
Funding
US National Science Foundation;
Keywords
Approximate dynamic programming (ADP); continuous-time dynamical systems; infinite-horizon discounted cost function; integral reinforcement learning (IRL); optimal control; Q-learning; value iteration (VI); ADAPTIVE OPTIMAL-CONTROL; ITERATION; SYSTEMS;
DOI
10.1109/TCYB.2014.2322116
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper presents a Q-learning method to solve the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most methods available in the existing literature for solving the LQR problem for CT systems require partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but has generally been well understood only for discrete-time systems. The contribution of this paper is a Q-learning methodology for CT systems that solves the LQR problem without any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. Simulation results supporting the theoretical development are also presented.
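To convey the flavor of the approach, the following is a minimal sketch of model-free Q-learning for a discounted LQR problem. It is a discrete-time (sampled-data) illustration, not the paper's continuous-time algorithm, and the system matrices, discount factor, and other hyperparameters are arbitrary illustrative assumptions. The quadratic Q-function Q(x,u) = [x;u]^T H [x;u] is identified from trajectory data by least squares, and the policy is improved from the learned H alone, so the dynamics (A, B) are used only to simulate data, never by the learner.

```python
import numpy as np

# Sketch: model-free Q-learning (least-squares policy iteration) for a
# discounted LQR problem. Discrete-time illustration only; the paper's
# algorithm works in continuous time with a Q-function parameterized by
# the state, the input, and the input's derivatives.

rng = np.random.default_rng(0)

# "Unknown" dynamics: used only to simulate data, never by the learner.
A = np.array([[0.95, 0.10],
              [0.00, 0.90]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)        # state cost weight
Rc = np.eye(1)        # input cost weight
gamma = 0.95          # discount factor
n, m = 2, 1

def phi(x, u):
    """Quadratic features: vec(z z^T) with z = [x; u]."""
    z = np.concatenate([x, u])
    return np.outer(z, z).ravel()

K = np.zeros((m, n))  # initial policy gain, u = -K x

for sweep in range(20):                          # policy-iteration sweeps
    rows, targets = [], []
    x = rng.normal(size=n)
    for t in range(400):                         # one exploratory rollout
        u = -K @ x + 0.5 * rng.normal(size=m)    # probing noise for excitation
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        u_next = -K @ x_next                     # action the current policy takes
        # Bellman equation for the current policy:
        #   Q(x, u) = cost + gamma * Q(x', u')
        rows.append(phi(x, u) - gamma * phi(x_next, u_next))
        targets.append(cost)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    H = theta.reshape(n + m, n + m)
    H = (H + H.T) / 2                            # symmetrize; quadratic form unchanged
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)                # greedy improvement: u = -Huu^{-1} Hux x

print("learned gain K:", K)
```

The policy-improvement step is the essential Q-learning payoff: minimizing the learned quadratic Q over u gives u = -Huu^{-1} Hux x in closed form, with no model required. The paper's CT method achieves the analogous step online via integral reinforcement learning over sampled intervals rather than the one-step Bellman equation used in this sketch.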
Pages: 165-176
Page count: 12
Related papers
50 records in total
  • [31] Output Feedback Reinforcement Learning Control for the Continuous-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018, : 3417 - 3422
  • [32] Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1523 - 1536
  • [33] Discounted linear Q-learning control with novel tracking cost and its stability
    Wang, Ding
    Ren, Jin
    Ha, Mingming
    INFORMATION SCIENCES, 2023, 626 : 339 - 353
  • [34] Infinite-horizon optimal control based on continuous-time continuous-state Hopfield neural networks
    Li, Ming-Ai
    Yu, Na-Gong
    Qiao, Jun-Fei
    Ruan, Xiao-Gang
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2006, 4 (04) : 707 - 719
  • [35] Nonconvex global optimization problems: Constrained infinite-horizon linear-quadratic control problems for discrete systems
    Yakubovich, VA
    DIRECTIONS IN MATHEMATICAL SYSTEMS THEORY AND OPTIMIZATION, 2003, 286 : 359 - 382
  • [36] Infinite horizon predictive control of constrained continuous-time linear systems
    Cannon, M
    Kouvaritakis, B
    AUTOMATICA, 2000, 36 (07) : 943 - 955
  • [37] Finite-horizon and infinite-horizon linear quadratic optimal control problems: A data-driven Euler scheme
    Wang, Guangchen
    Zhang, Heng
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2024, 361 (13):
  • [38] Adaptive linear quadratic regulator for continuous-time systems with uncertain dynamics
    Jha, Sumit Kumar
    Bhasin, Shubhendu
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2020, 7 (03) : 833 - 841
  • [40] Output Feedback Reinforcement Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,