Policy Iteration Approximate Dynamic Programming Using Volterra Series Based Actor

Cited: 0
Authors
Guo, Wentao [1 ]
Si, Jennie [2 ]
Liu, Feng [1 ]
Mei, Shengwei [1 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, State Key Lab Power Syst, Beijing 100084, Peoples R China
[2] Arizona State Univ, Dept Elect Engn, Tempe, AZ 85287 USA
Keywords
TIME NONLINEAR-SYSTEMS; ADAPTIVE CRITIC DESIGNS; ONLINE LEARNING CONTROL; FEEDBACK-CONTROL; NEURAL-NETWORKS; REINFORCEMENT; IDENTIFICATION; ALGORITHM;
DOI
not available
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
There is an extensive literature on value function approximation for approximate dynamic programming (ADP). Multilayer perceptrons (MLPs) and radial basis functions (RBFs), among others, are typical approximators for value functions in ADP, and similar approaches have been taken for policy approximation. In this paper, we propose a new Volterra series based structure for actor approximation in ADP. The Volterra approximator is linear in its parameters, so the global optimum is attainable. Given the proposed approximator structure, we further develop a policy iteration framework under which a gradient descent algorithm is derived to train the optimal Volterra kernels. For this ADP design, we provide a sufficient condition, based on the actor approximation error, that guarantees convergence of the value function iterations, and we give a finite bound on the convergent value function. Finally, a simulation example illustrates the effectiveness of the proposed Volterra actor for the optimal control of a nonlinear system.
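The key structural idea in the abstract, an actor that is linear in its Volterra kernel parameters, lends itself to a compact illustration. Below is a minimal sketch in Python, assuming a discrete-time state-feedback setting, a scalar control u = theta^T phi(x), and a Volterra expansion truncated at second order; the names (volterra_features, VolterraActor, dQ_du) are hypothetical and not taken from the paper.

    import numpy as np

    def volterra_features(x):
        # Feature vector of a second-order truncated Volterra expansion of the
        # state: constant term, linear terms, and unique quadratic cross terms.
        quad = np.outer(x, x)[np.triu_indices(len(x))]
        return np.concatenate(([1.0], x, quad))

    class VolterraActor:
        # Actor u = theta^T phi(x); linear in the kernel parameters theta, so
        # for a fixed critic the training objective is convex in theta.
        def __init__(self, state_dim):
            n_feat = 1 + state_dim + state_dim * (state_dim + 1) // 2
            self.theta = np.zeros(n_feat)

        def act(self, x):
            return self.theta @ volterra_features(x)

        def gradient_step(self, x, dQ_du, lr=1e-2):
            # One policy-improvement step: descend the critic's action
            # gradient. Since u is linear in theta, du/dtheta = phi(x).
            self.theta -= lr * dQ_du * volterra_features(x)

    # Usage: one improvement step at a sample state.
    actor = VolterraActor(state_dim=2)
    u = actor.act(np.array([0.5, -0.3]))
    actor.gradient_step(np.array([0.5, -0.3]), dQ_du=0.1)

Because the actor output is linear in theta, the gradient of any differentiable critic objective with respect to theta is simply the action gradient scaled by phi(x), which is what makes the global optimum attainable for a fixed critic.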
Pages: 249 - 255
Page count: 7
Related papers (50 total)
  • [41] Empirical model based control of nonlinear processes using approximate dynamic programming
    Lee, JM
    Lee, JH
    [J]. PROCEEDINGS OF THE 2004 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2004, : 3041 - 3046
  • [42] Approximate Dynamic Programming for Selective Maintenance in Series-Parallel Systems
    Ahadi, Khatereh
    Sullivan, Kelly M.
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2020, 69 (03) : 1147 - 1164
  • [43] Modeling pharmacokinetics/dynamic data using Volterra series
    Verotta, Davide
[J]. ANNALS OF BIOMEDICAL ENGINEERING, 2000, 28 (SUPPL. 1)
  • [44] Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms
    Zhang, Huaguang
    Jiang, He
    Luo, Chaomin
    Xiao, Geyang
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3331 - 3340
  • [45] State Aggregation based Linear Programming approach to Approximate Dynamic Programming
    Darbha, S.
    Krishnamoorthy, K.
    Pachter, M.
    Chandler, P.
    [J]. 49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 935 - 941
  • [46] Approximate Dynamic Programming Using Support Vector Regression
    Bethke, Brett
    How, Jonathan P.
    Ozdaglar, Asuman
    [J]. 47TH IEEE CONFERENCE ON DECISION AND CONTROL, 2008 (CDC 2008), 2008, : 3811 - 3816
  • [47] Patient admission planning using Approximate Dynamic Programming
    Hulshof, Peter J. H.
    Mes, Martijn R. K.
    Boucherie, Richard J.
    Hans, Erwin W.
    [J]. FLEXIBLE SERVICES AND MANUFACTURING JOURNAL, 2016, 28 (1-2) : 30 - 61
  • [49] Region-based approximation in approximate dynamic programming
    Sardarmehni, Tohid
    Song, Xingyong
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 2024, 97 (02) : 306 - 315
  • [50] Microgrid Energy Management based on Approximate Dynamic Programming
    Strelec, Martin
    Berka, Jan
[J]. 2013 4TH IEEE/PES INNOVATIVE SMART GRID TECHNOLOGIES EUROPE (ISGT EUROPE), 2013