Semi-Markov decision problems and performance sensitivity analysis

Cited by: 53
Author
Cao, XR [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Ctr Networking, Kowloon, Hong Kong, Peoples R China
Keywords
discounted Poisson equations; discrete-event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); perturbation analysis (PA); perturbation realization; Poisson equations; policy iteration; potentials; reinforcement learning (RL)
DOI
10.1109/TAC.2003.811252
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is twofold. First, we develop PA theory for semi-Markov processes (SMPs); second, we extend the aforementioned results about the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived from performance potentials and realization matrices. Both the long-run average and discounted-cost problems are considered; this approach provides a unified framework for both problems, in which the long-run average problem corresponds to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single-sample-path-based implementations are discussed.
Pages: 758-769
Page count: 12