Bounding reward measures of Markov models using the Markov decision processes

Cited by: 2

Authors:

Buchholz, Peter [1]

Affiliations:

[1] TU Dortmund, D-44221 Dortmund, Germany

Source:

NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS | 2011 / Vol. 18 / Issue 06

Keywords:

Markov processes; stationary analysis; bounds; Markov decision processes; ALGORITHM; GMRES;

DOI:

10.1002/nla.792

Chinese Library Classification number:

O29 [Applied Mathematics];

Discipline classification code:

070104 ;

Abstract:

For a Markov reward process in which upper and lower bounds for the transition rates and rewards are known, a new approach to bounding the expected reward is presented. A previous paper defined sharp bounds for the problem but proposed only an inefficient and unstable algorithm; this paper presents algorithms that compute the bounds by interpreting the problem as a Markov Decision Process. In this way, the well-known value and policy iteration algorithms can be adopted to compute reward bounds in a stable and fairly efficient way. Different numerical techniques are presented for computing the reward bounds. Copyright (C) 2011 John Wiley & Sons, Ltd.
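The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a discrete-time chain (for example, one obtained from the rate matrix by uniformization), a finite list of candidate transition rows per state standing in for the interval bounds, and a unichain, aperiodic model. The function name `bound_average_reward` and the two-state example are hypothetical. Each candidate row is treated as an action of a Markov decision process, and relative value iteration then yields an upper or lower bound on the long-run average reward.

```python
def bound_average_reward(actions, maximize=True, tol=1e-12, max_iter=100_000):
    """Bound the long-run average reward by relative value iteration.

    actions[s] is a list of (row, reward) candidates for state s, where
    row is a probability vector over successor states. Assumption: every
    choice of candidates gives a unichain, aperiodic Markov chain.
    """
    n = len(actions)
    pick = max if maximize else min
    h = [0.0] * n          # relative values, normalized so that h[0] == 0
    gain = 0.0
    for _ in range(max_iter):
        # One Bellman backup: choose the best candidate row in each state.
        backed_up = [
            pick(r + sum(p * hj for p, hj in zip(row, h)) for row, r in actions[s])
            for s in range(n)
        ]
        gain = backed_up[0]                     # estimate of the average reward
        new_h = [v - gain for v in backed_up]   # renormalize against state 0
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            break
        h = new_h
    return gain


# Hypothetical example: in state 0 (reward 0) the probability of moving to
# state 1 is only known to lie in [0.2, 0.4]; the two interval endpoints are
# listed as candidate rows. State 1 (reward 1) has a fixed row.
actions = [
    [([0.8, 0.2], 0.0), ([0.6, 0.4], 0.0)],
    [([0.5, 0.5], 1.0)],
]
upper = bound_average_reward(actions, maximize=True)
lower = bound_average_reward(actions, maximize=False)
```

In this small example the bounds can be checked by hand: with jump probability p out of state 0, the stationary probability of state 1 is p / (p + 0.5), so the average reward ranges from 2/7 (p = 0.2) to 4/9 (p = 0.4). Listing only the interval endpoints suffices here because each backup is linear in the chosen row.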


Pages: 919-930

Number of pages: 12

