Policy Iteration Approximate Dynamic Programming Using Volterra Series Based Actor

Cited: 0
Authors
Guo, Wentao [1 ]
Si, Jennie [2 ]
Liu, Feng [1 ]
Mei, Shengwei [1 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, State Key Lab Power Syst, Beijing 100084, Peoples R China
[2] Arizona State Univ, Dept Elect Engn, Tempe, AZ 85287 USA
Keywords
TIME NONLINEAR-SYSTEMS; ADAPTIVE CRITIC DESIGNS; ONLINE LEARNING CONTROL; FEEDBACK-CONTROL; NEURAL-NETWORKS; REINFORCEMENT; IDENTIFICATION; ALGORITHM;
DOI
not available
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
There is an extensive literature on value function approximation for approximate dynamic programming (ADP). Multilayer perceptrons (MLPs) and radial basis functions (RBFs), among others, are typical approximators for value functions in ADP, and similar approaches have been taken for policy approximation. In this paper, we propose a new Volterra series based structure for actor approximation in ADP. The Volterra approximator is linear in its parameters, so the global optimum is attainable. Given the proposed approximator structure, we further develop a policy iteration framework under which a gradient descent algorithm is derived to train the optimal Volterra kernels. For this ADP design, we provide a sufficient condition, based on the actor approximation error, that guarantees convergence of the value function iterations, and we give a finite bound on the convergent value function. Finally, a simulation example illustrates the effectiveness of the proposed Volterra actor for the optimal control of a nonlinear system.
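The key structural idea in the abstract, an actor that is linear in its Volterra kernel parameters, lends itself to a compact illustration. Below is a minimal sketch in Python, assuming a discrete-time state-feedback setting, a scalar control u = theta^T phi(x), and a Volterra expansion truncated at second order; the names (volterra_features, VolterraActor, dQ_du) are hypothetical and not taken from the paper.

    import numpy as np

    def volterra_features(x):
        # Feature vector of a second-order truncated Volterra expansion of the
        # state: constant term, linear terms, and unique quadratic cross terms.
        quad = np.outer(x, x)[np.triu_indices(len(x))]
        return np.concatenate(([1.0], x, quad))

    class VolterraActor:
        # Actor u = theta^T phi(x); linear in the kernel parameters theta, so
        # for a fixed critic the training objective is convex in theta.
        def __init__(self, state_dim):
            n_feat = 1 + state_dim + state_dim * (state_dim + 1) // 2
            self.theta = np.zeros(n_feat)

        def act(self, x):
            return self.theta @ volterra_features(x)

        def gradient_step(self, x, dQ_du, lr=1e-2):
            # One policy-improvement step: descend the critic's action
            # gradient. Since u is linear in theta, du/dtheta = phi(x).
            self.theta -= lr * dQ_du * volterra_features(x)

    # Usage: one improvement step at a sample state.
    actor = VolterraActor(state_dim=2)
    u = actor.act(np.array([0.5, -0.3]))
    actor.gradient_step(np.array([0.5, -0.3]), dQ_du=0.1)

Because the actor output is linear in theta, the gradient of any differentiable critic objective with respect to theta is simply the action gradient scaled by phi(x), which is what makes the global optimum attainable for a fixed critic.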
Pages: 249 - 255
Page count: 7
Related papers (50 total)
  • [41] Empirical model based control of nonlinear processes using approximate dynamic programming
    Lee, JM
    Lee, JH
    [J]. PROCEEDINGS OF THE 2004 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2004, : 3041 - 3046
  • [42] Approximate Dynamic Programming for Selective Maintenance in Series-Parallel Systems
    Ahadi, Khatereh
    Sullivan, Kelly M.
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2020, 69 (03) : 1147 - 1164
  • [43] Modeling pharmacokinetics/dynamic data using Volterra series
    Verotta, Davide
[J]. ANNALS OF BIOMEDICAL ENGINEERING, 2000, 28 (SUPPL. 1)
  • [44] Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms
    Zhang, Huaguang
    Jiang, He
    Luo, Chaomin
    Xiao, Geyang
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3331 - 3340
  • [45] State Aggregation based Linear Programming approach to Approximate Dynamic Programming
    Darbha, S.
    Krishnamoorthy, K.
    Pachter, M.
    Chandler, P.
    [J]. 49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 935 - 941
  • [46] Approximate Dynamic Programming Using Support Vector Regression
    Bethke, Brett
    How, Jonathan P.
    Ozdaglar, Asuman
    [J]. 47TH IEEE CONFERENCE ON DECISION AND CONTROL, 2008 (CDC 2008), 2008, : 3811 - 3816
  • [47] Patient admission planning using Approximate Dynamic Programming
    Hulshof, Peter J. H.
    Mes, Martijn R. K.
    Boucherie, Richard J.
    Hans, Erwin W.
    [J]. FLEXIBLE SERVICES AND MANUFACTURING JOURNAL, 2016, 28 (1-2) : 30 - 61
  • [49] Region-based approximation in approximate dynamic programming
    Sardarmehni, Tohid
    Song, Xingyong
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 2024, 97 (02) : 306 - 315
  • [50] Microgrid Energy Management based on Approximate Dynamic Programming
    Strelec, Martin
    Berka, Jan
[J]. 2013 4TH IEEE/PES INNOVATIVE SMART GRID TECHNOLOGIES EUROPE (ISGT EUROPE), 2013