Deep reinforcement learning based finite-horizon optimal control for a discrete-time affine nonlinear system

Cited by: 0
Authors
Kim, Jong Woo [1]
Park, Byung Jun [1]
Yoo, Haeun [2]
Lee, Jay H. [2]
Lee, Jong Min [1]
Affiliations
[1] Seoul Natl Univ, Sch Chem & Biol Engn, Inst Chem Proc, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Korea Adv Inst Sci & Technol, Chem & Biomol Engn Dept, Daejeon 34141, South Korea
Keywords
Reinforcement learning; Approximate dynamic programming; Deep learning; Actor-critic method; Finite horizon optimal control; DESIGN
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Approximate dynamic programming (ADP) aims to obtain an approximate numerical solution to the discrete-time Hamilton-Jacobi-Bellman (HJB) equation. Heuristic dynamic programming (HDP) is a two-stage iterative ADP scheme that separates the HJB equation into two equations, one for the value function and one for the policy function, which are referred to as the critic and the actor, respectively. Previous ADP implementations have been limited by the choice of function approximator, which requires either significant prior domain knowledge or a large number of parameters to be fitted. Recent advances in deep learning from the computer science community, however, enable the use of deep neural networks (DNNs) to approximate high-dimensional nonlinear functions without prior domain knowledge. Motivated by this, we examine the potential of DNNs as function approximators for the critic and the actor. In contrast to the infinite-horizon optimal control problem, the critic and the actor of the finite-horizon optimal control (FHOC) problem are time-varying functions and must satisfy a boundary condition. A DNN structure and a training algorithm suitable for FHOC are presented. Illustrative examples are provided to demonstrate the validity of the proposed method.
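The time-varying critic and actor and the boundary condition mentioned in the abstract can be illustrated with a minimal sketch. This is not the DNN-based method of the paper: it assumes the scalar linear-quadratic special case x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 and terminal cost q_f*x^2, for which the value function has the closed form V_k(x) = p_k*x^2. The function names and parameters below are hypothetical and chosen for illustration only.

```python
def finite_horizon_lq(a, b, q, r, q_f, N):
    """Backward recursion for the assumed scalar LQ problem.

    Returns the critic coefficients p (V_k(x) = p[k] * x**2) and the
    actor gains K (u_k = -K[k] * x) for k = 0..N-1.
    """
    p = [0.0] * (N + 1)
    K = [0.0] * N
    p[N] = q_f  # boundary condition: V_N(x) = q_f * x**2
    for k in range(N - 1, -1, -1):
        # Actor step: minimize q*x^2 + r*u^2 + p[k+1]*(a*x + b*u)^2 over u.
        K[k] = a * b * p[k + 1] / (r + b * b * p[k + 1])
        # Critic step: substitute the minimizing input into the Bellman equation.
        p[k] = q + a * a * p[k + 1] - a * b * p[k + 1] * K[k]
    return p, K


def rollout_cost(a, b, q, r, q_f, K, x0):
    """Simulate the time-varying policy and accumulate the total cost."""
    x, cost = x0, 0.0
    for gain in K:
        u = -gain * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + q_f * x * x


p, K = finite_horizon_lq(a=1.0, b=1.0, q=1.0, r=1.0, q_f=1.0, N=5)
total = rollout_cost(1.0, 1.0, 1.0, 1.0, 1.0, K, x0=2.0)
```

In this sketch both p[k] and K[k] depend on the time index k, and the simulated cost from x0 matches p[0]*x0^2, which is the property a finite-horizon critic is expected to reproduce. A nonlinear affine system would generally have no such closed form, which is where a function approximator such as a DNN comes in.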
Pages: 567-572
Page count: 6
Related papers
50 in total
  • [41] LOGARITHMIC TRANSFORMATIONS FOR DISCRETE-TIME, FINITE-HORIZON STOCHASTIC-CONTROL PROBLEMS
    ALBERTINI, F
    RUNGGALDIER, WJ
    APPLIED MATHEMATICS AND OPTIMIZATION, 1988, 18 (02): 143-161
  • [42] Finite-Horizon Separation-Based Covariance Control for Discrete-Time Stochastic Linear Systems
    Bakolas, Efstathios
    2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018: 3299-3304
  • [43] Neural Network-based Finite-Horizon Approximately Optimal Control of Uncertain Affine Nonlinear Continuous-time Systems
    Xu, Hao
    Zhao, Qiming
    Dierks, Travis
    Jagannathan, S.
    2014 AMERICAN CONTROL CONFERENCE (ACC), 2014: 1243-1248
  • [44] Finite-horizon, discrete-time variable tolerance H∞ filtering
    O'Brien, Richard T.
    Kiriakidis, Kiriakos
    2007 AMERICAN CONTROL CONFERENCE, VOLS 1-13, 2007: 4581-4585
  • [45] Greedy Finite-Horizon Covariance Steering for Discrete-Time Stochastic Nonlinear Systems Based on the Unscented Transform
    Bakolas, Efstathios
    Tsolovikos, Alexandros
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020: 3595-3600
  • [46] Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach
    Wang, Ding
    Liu, Derong
    Wei, Qinglai
    NEUROCOMPUTING, 2012, 78 (01): 14-22
  • [47] A discussion on the discrete-time finite-horizon indefinite LQ problem
    Ferrante, Augusto
    Ntogramatzidis, Lorenzo
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016: 216-220
  • [48] Finite-Horizon Near-Optimal Output Feedback Neural Network Control of Quantized Nonlinear Discrete-Time Systems With Input Constraint
    Xu, Hao
    Zhao, Qiming
    Jagannathan, Sarangapani
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (08): 1776-1788
  • [49] Optimal Finite-Horizon Control with Disturbance Attenuation for Uncertain Discrete-Time T-S Fuzzy Model Based Systems
    Horng, Wen-Ren
    Chou, Jyh-Horng
    Fang, Chun-Hsiung
    2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014: 2006-2009
  • [50] FINITE-HORIZON ε-OPTIMAL TRACKING CONTROL OF DISCRETE-TIME LINEAR SYSTEMS USING ITERATIVE APPROXIMATE DYNAMIC PROGRAMMING
    Tan, Fuxiao
    Luo, Bin
    Guan, Xinping
    ASIAN JOURNAL OF CONTROL, 2015, 17 (01): 176-189