Semi-Markov decision problems and performance sensitivity analysis

Cited by: 53

Authors:

Cao, XR [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Ctr Networking, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2003 / Vol. 48 / No. 05

Keywords:

discounted Poisson equations; discrete-event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); perturbation analysis (PA); perturbation realization; Poisson equations; policy iteration; potentials; reinforcement learning (RL);

DOI:

10.1109/TAC.2003.811252

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is twofold. First, we develop PA theory for semi-Markov processes (SMPs); second, we extend the aforementioned results about the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived based on performance potentials and realization matrices. Both the long-run average and discounted-cost problems are considered; this approach provides a unified framework for both problems, in which the long-run average problem corresponds to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single sample path-based implementations are discussed.

Pages: 758-769

Page count: 12
