Semi-Markov decision problems and performance sensitivity analysis

Cited by: 53
Author
Cao, XR [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Ctr Networking, Kowloon, Hong Kong, Peoples R China
Keywords
discounted Poisson equations; discrete-event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); perturbation analysis (PA); perturbation realization; Poisson equations; policy iteration; potentials; reinforcement learning (RL)
DOI
10.1109/TAC.2003.811252
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is two-fold. First, we develop PA theory for semi-Markov processes (SMPs); second, we extend the aforementioned results about the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived based on the performance potential and the realization matrix. Both the long-run average and discounted-cost problems are considered; this approach provides a unified framework for both problems, in which the long-run average problem corresponds to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single sample path-based implementations are discussed.
Pages: 758-769
Number of pages: 12
Related papers
50 in total
  • [21] Semi-Markov decision processes: nonstandard criteria
    Baykal-Guersoy, M.
    Guersoy, K.
    PROBABILITY IN THE ENGINEERING AND INFORMATIONAL SCIENCES, 2007, 21 (04) : 635 - 657
  • [22] Second Order Optimality in Markov and Semi-Markov Decision Processes
    Sladky, Karel
    37TH INTERNATIONAL CONFERENCE ON MATHEMATICAL METHODS IN ECONOMICS 2019, 2019, : 338 - 343
  • [23] Optimal stopping problems for semi-Markov processes
    Z Angew Math Mech ZAMM, Suppl 3 (219):
  • [24] Optimal stopping problems for semi-Markov processes
    Boshuizen, FA
    Gouweleeuw, JM
    ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND MECHANIK, 1996, 76 : 219 - 222
  • [25] Performance Analysis for Controlled Semi-Markov Systems with Application to Maintenance
    Huang, Yonghui
    Guo, Xianping
    Song, Xinyuan
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2011, 150 (02) : 395 - 415
  • [26] Performance Analysis for Controlled Semi-Markov Systems with Application to Maintenance
    Yonghui Huang
    Xianping Guo
    Xinyuan Song
    Journal of Optimization Theory and Applications, 2011, 150 : 395 - 415
  • [27] Performance optimization of semi-Markov decision processes with discounted-cost criteria
    Yin, Baoqun
    Li, Yanjie
    Zhou, Yaping
    Xi, Hongsheng
    EUROPEAN JOURNAL OF CONTROL, 2008, 14 (03) : 213 - 222
  • [28] Sample-Path Based Performance Sensitivity Construction of Semi-Markov Systems
    Li, Yanjie
    Zhang, Junyu
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 2449 - 2453
  • [29] Risk-sensitive semi-Markov decision problems with discounted cost and general utilities
    Bhabak, Arnab
    Saha, Subhamay
    STATISTICS & PROBABILITY LETTERS, 2022, 184
  • [30] Approximation solution and suboptimality for discounted semi-Markov decision problems with countable state space
    Hudak, D
    Nollau, V
    OPTIMIZATION, 2004, 53 (04) : 339 - 353