A Sensitivity-Based Construction Approach to Variance Minimization of Markov Decision Processes

Authors
Huang, Yonghao [1]
Chen, Xi [2]
Affiliations
[1] Chinese Insurance Informat Technol Co Ltd, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Variance minimization; sensitivity-based approach; Markov decision process; sample-path optimality; performance sensitivities; optimization
DOI
10.1002/asjc.1875
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
With the long-run average performance as the primary criterion for a Markov decision process, variance measures are studied as secondary criteria. Two measures are discussed: the steady-state variance and the limiting average variance along a sample path. The latter is difficult to handle because of its special form. With a sensitivity-based approach, the difference formula for the sample-path variance under different policies is constructed intuitively, and the optimality equation is then presented. Moreover, a policy iteration algorithm is developed. This work extends the sensitivity-based construction approach to Markov decision processes with non-standard performance criteria. The difference between these two types of variance and the bias criterion is illustrated with a numerical example.
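As a rough illustration of the simpler of the two criteria, the sketch below computes the long-run average reward and the steady-state variance of a Markov chain under one fixed policy. It is not the paper's algorithm: the transition matrix `P`, the reward vector `f`, and both function names are hypothetical, and the chain is assumed to be ergodic so that power iteration converges. The sample-path variance, which the abstract describes as the harder case, is not covered here.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration.

    Assumes P is a row-stochastic matrix of an ergodic (irreducible, aperiodic) chain.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def average_and_steady_state_variance(P, f):
    """Return (eta, var) where eta = sum_i pi(i) f(i) and
    var = sum_i pi(i) (f(i) - eta)^2 under the stationary distribution pi."""
    pi = stationary_distribution(P)
    eta = sum(p * r for p, r in zip(pi, f))
    var = sum(p * (r - eta) ** 2 for p, r in zip(pi, f))
    return eta, var


# Hypothetical two-state chain induced by some fixed policy (illustrative values).
P = [[0.9, 0.1],
     [0.5, 0.5]]
f = [1.0, 3.0]  # hypothetical per-state rewards

eta, var = average_and_steady_state_variance(P, f)
print(eta, var)
```

For this example the stationary distribution is (5/6, 1/6), so the average reward is 4/3 and the steady-state variance is 5/9. Comparing such values across policies with the same average reward is one simple way to see why a variance measure can serve as a secondary criterion.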
Pages: 1166-1178
Page count: 13