A policy improvement method in constrained stochastic dynamic programming

Cited by: 9
Authors
Chang, Hyeong Soo [1]
Affiliations
[1] Sogang Univ, Dept Comp Sci & Engn, Seoul 121742, South Korea
[2] Sogang Univ, Program Integrated Biotechnol, Seoul 121742, South Korea
Keywords
constrained Markov decision process; dynamic programming; policy improvement; policy iteration
DOI
10.1109/TAC.2006.880801
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This note presents a formal method of improving a given base-policy such that the performance of the resulting policy is no worse than that of the base-policy at all states in constrained stochastic dynamic programming. We consider finite horizon and discounted infinite horizon cases. The improvement method induces a policy-iteration-type algorithm that converges to a locally optimal policy.
Pages: 1523-1526
Page count: 4