Semi-Markov decision problems and performance sensitivity analysis

Cited by: 53

Authors:

Cao, XR [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Ctr Networking, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2003 / Vol. 48 / No. 05

Keywords:

discounted Poisson equations; discrete-event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); perturbation analysis (PA); perturbation realization; Poisson equations; policy iteration; potentials; reinforcement learning (RL);

DOI:

10.1109/TAC.2003.811252

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is two-fold. First, we develop PA theory for semi-Markov processes (SMPs); second, we extend the aforementioned results about the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived based on the performance potential and the realization matrix. Both the long-run average and discounted-cost problems are considered; this approach provides a unified framework for both problems, and the long-run average problem corresponds to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single sample path-based implementations are discussed.

Pages: 758 - 769

Page count: 12

50 items in total

[1] Sensitivity analysis of performance for semi-Markov processes
Yin, BQ
Xi, HS
Zhou, YP
PROCEEDINGS OF THE 3RD WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-5, 2000, : 2347 - 2350
[2] Performance Sensitivity Analysis and Optimization for a Class of Countable Semi-Markov Decision Processes
Kang, Yu
Yin, Baoqun
Shang, Weike
Xi, Hongsheng
2011 9TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA 2011), 2011, : 799 - 804
[3] EVENT COUPLING AND PERFORMANCE SENSITIVITY ANALYSIS OF GENERALIZED SEMI-MARKOV PROCESSES
CAO, XR
ADVANCES IN APPLIED PROBABILITY, 1995, 27 (03) : 741 - 769
[4] Towards Analysis of Semi-Markov Decision Processes
Chen, Taolue
Lu, Jian
ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT I, 2010, 6319 : 41 - +
[5] Error bounds and sensitivity analysis of semi-Markov processes
Sladky, K
OPERATIONS RESEARCH PROCEEDINGS 1999, 2000, : 148 - 153
[6] OBSERVABLE AUGMENTED SYSTEMS FOR SENSITIVITY ANALYSIS OF MARKOV AND SEMI-MARKOV PROCESSES
CASSANDRAS, CG
STRICKLAND, SG
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1989, 34 (10) : 1026 - 1037
[7] Observable augmented systems for sensitivity analysis of Markov and semi-Markov processes
Cassandras, Christos G., 1600, (34):
[8] Optimal replacement of a system according to a semi-Markov decision process in a semi-Markov environment
Hu, QY
Yue, WY
OPTIMIZATION METHODS & SOFTWARE, 2003, 18 (02) : 181 - 196
[9] A unified approach to Markov decision problems and performance sensitivity analysis
Cao, XR
AUTOMATICA, 2000, 36 (05) : 771 - 774
[10] A basic formula for performance gradient estimation of semi-Markov decision processes
Li, Yanjie
Cao, Fang
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2013, 224 (02) : 333 - 339
