A policy improvement method in constrained stochastic dynamic programming

Cited by: 9

Authors:

Chang, Hyeong Soo [1]

Affiliations:

[1] Sogang Univ, Dept Comp Sci & Engn, Seoul 121742, South Korea

[2] Sogang Univ, Program Integrated Biotechnol, Seoul 121742, South Korea

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2006, Vol. 51, No. 09

Keywords:

constrained Markov decision process; dynamic programming; policy improvement; policy iteration

DOI:

10.1109/TAC.2006.880801

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This note presents a formal method of improving a given base-policy such that the performance of the resulting policy is no worse than that of the base-policy at all states in constrained stochastic dynamic programming. We consider finite horizon and discounted infinite horizon cases. The improvement method induces a policy iteration-type algorithm that converges to a local optimal policy.

Pages: 1523-1526

Page count: 4

