Semi-Markov decision problems and performance sensitivity analysis

Cited by: 53
Authors
Cao, XR [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Ctr Networking, Kowloon, Hong Kong, Peoples R China
Keywords
discounted Poisson equations; discrete-event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); perturbation analysis (PA); perturbation realization; Poisson equations; policy iteration; potentials; reinforcement learning (RL)
DOI
10.1109/TAC.2003.811252
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is twofold: first, we develop PA theory for semi-Markov processes (SMPs); second, we extend the aforementioned results on the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived from performance potentials and realization matrices. Both the long-run average and discounted-cost problems are considered; this approach provides a unified framework for both problems, with the long-run average problem corresponding to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single sample path-based implementations are discussed.
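The role of potentials in sensitivity analysis can be illustrated with a minimal sketch for an ordinary discrete-time Markov chain. This is the simpler Markov analogue, not the semi-Markov formulas of the paper, and the 3-state matrices `P`, `P_alt` and reward vector `f` are hypothetical values chosen only for illustration. The sketch solves the Poisson equation (I - P + e pi) g = f for the potential vector g, then checks numerically that the derivative of the average reward along the direction Q = P_alt - P equals pi Q g, and that the average-reward difference between the two chains equals pi_alt Q g.

```python
import numpy as np

def stationary(P):
    """Stationary distribution pi of an ergodic chain: pi P = pi, pi e = 1."""
    n = P.shape[0]
    # Stack the balance equations with the normalization row; the system is
    # consistent, so least squares returns its exact solution.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def potentials(P, f):
    """Solve the Poisson equation (I - P + e pi) g = f for the potentials g."""
    n = P.shape[0]
    pi = stationary(P)
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)
    return pi, g

# Hypothetical data (assumption, not from the paper).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.4, 0.5]])
P_alt = np.array([[0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4]])
f = np.array([1.0, 2.0, 5.0])

pi, g = potentials(P, f)
eta = pi @ f                      # long-run average reward under P
Q = P_alt - P                     # perturbation direction

# Sensitivity formula: derivative of eta along Q at the unperturbed chain.
derivative = pi @ Q @ g

# Finite-difference estimate of the same derivative, for comparison.
delta = 1e-6
finite_diff = (stationary(P + delta * Q) @ f - eta) / delta

# Difference formula, the basis of a policy-iteration improvement step.
pi_alt = stationary(P_alt)
eta_alt = pi_alt @ f
difference = pi_alt @ Q @ g

print(derivative, finite_diff)
print(difference, eta_alt - eta)
```

Both identities use only the stationary distribution and the potentials of the original chain, which is why a single potential vector can serve both gradient estimation and policy comparison in this simplified setting.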
Pages: 758-769
Page count: 12