Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems

Cited by: 73
Authors
Palanisamy, Muthukumar [1 ,2 ]
Modares, Hamidreza [2 ]
Lewis, Frank L. [2 ]
Aurangzeb, Muhammad [2 ]
Affiliations
[1] Gandhigram Rural Inst Deemed Univ, Dept Math, Gandhigram 624302, India
[2] Univ Texas Arlington Res Inst, Ft Worth, TX 76118 USA
Funding
US National Science Foundation;
Keywords
Approximate dynamic programming (ADP); continuous-time dynamical systems; infinite-horizon discounted cost function; integral reinforcement learning (IRL); optimal control; Q-learning; value iteration (VI); ADAPTIVE OPTIMAL-CONTROL; ITERATION; SYSTEMS;
DOI
10.1109/TCYB.2014.2322116
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper presents a Q-learning method to solve the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most methods available in the existing literature for solving the LQR problem for CT systems require partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but has generally been well understood only for discrete-time systems. The contribution of this paper is a Q-learning methodology for CT systems that solves the LQR problem without any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. Simulation results supporting the theoretical development are also presented.
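To convey the flavor of the approach, the following is a minimal sketch of model-free Q-learning for a discounted LQR problem. It is a discrete-time (sampled-data) illustration, not the paper's continuous-time algorithm, and the system matrices, discount factor, and other hyperparameters are arbitrary illustrative assumptions. The quadratic Q-function Q(x,u) = [x;u]^T H [x;u] is identified from trajectory data by least squares, and the policy is improved from the learned H alone, so the dynamics (A, B) are used only to simulate data, never by the learner.

```python
import numpy as np

# Sketch: model-free Q-learning (least-squares policy iteration) for a
# discounted LQR problem. Discrete-time illustration only; the paper's
# algorithm works in continuous time with a Q-function parameterized by
# the state, the input, and the input's derivatives.

rng = np.random.default_rng(0)

# "Unknown" dynamics: used only to simulate data, never by the learner.
A = np.array([[0.95, 0.10],
              [0.00, 0.90]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)        # state cost weight
Rc = np.eye(1)        # input cost weight
gamma = 0.95          # discount factor
n, m = 2, 1

def phi(x, u):
    """Quadratic features: vec(z z^T) with z = [x; u]."""
    z = np.concatenate([x, u])
    return np.outer(z, z).ravel()

K = np.zeros((m, n))  # initial policy gain, u = -K x

for sweep in range(20):                          # policy-iteration sweeps
    rows, targets = [], []
    x = rng.normal(size=n)
    for t in range(400):                         # one exploratory rollout
        u = -K @ x + 0.5 * rng.normal(size=m)    # probing noise for excitation
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        u_next = -K @ x_next                     # action the current policy takes
        # Bellman equation for the current policy:
        #   Q(x, u) = cost + gamma * Q(x', u')
        rows.append(phi(x, u) - gamma * phi(x_next, u_next))
        targets.append(cost)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    H = theta.reshape(n + m, n + m)
    H = (H + H.T) / 2                            # symmetrize; quadratic form unchanged
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)                # greedy improvement: u = -Huu^{-1} Hux x

print("learned gain K:", K)
```

The policy-improvement step is the essential Q-learning payoff: minimizing the learned quadratic Q over u gives u = -Huu^{-1} Hux x in closed form, with no model required. The paper's CT method achieves the analogous step online via integral reinforcement learning over sampled intervals rather than the one-step Bellman equation used in this sketch.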
Pages: 165-176
Page count: 12
Related papers
50 records in total
  • [31] Output Feedback Reinforcement Learning Control for the Continuous-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018, : 3417 - 3422
  • [32] Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1523 - 1536
  • [33] Discounted linear Q-learning control with novel tracking cost and its stability
    Wang, Ding
    Ren, Jin
    Ha, Mingming
    INFORMATION SCIENCES, 2023, 626 : 339 - 353
  • [34] Infinite-horizon optimal control based on continuous-time continuous-state Hopfield neural networks
    Li, Ming-Ai
    Yu, Na-Gong
    Qiao, Jun-Fei
    Ruan, Xiao-Gang
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2006, 4 (04) : 707 - 719
  • [35] Nonconvex global optimization problems: Constrained infinite-horizon linear-quadratic control problems for discrete systems
    Yakubovich, VA
    DIRECTIONS IN MATHEMATICAL SYSTEMS THEORY AND OPTIMIZATION, 2003, 286 : 359 - 382
  • [36] Infinite horizon predictive control of constrained continuous-time linear systems
    Cannon, M
    Kouvaritakis, B
    AUTOMATICA, 2000, 36 (07) : 943 - 955
  • [37] Finite-horizon and infinite-horizon linear quadratic optimal control problems: A data-driven Euler scheme
    Wang, Guangchen
    Zhang, Heng
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2024, 361 (13):
  • [38] Adaptive linear quadratic regulator for continuous-time systems with uncertain dynamics
    Jha, Sumit Kumar
    Bhasin, Shubhendu
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2020, 7 (03) : 833 - 841
  • [40] Output Feedback Reinforcement Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,