Bounding reward measures of Markov models using the Markov decision processes

Cited by: 2
Author
Buchholz, Peter [1]
Affiliation
[1] TU Dortmund, D-44221 Dortmund, Germany
Keywords
Markov processes; stationary analysis; bounds; Markov decision processes; ALGORITHM; GMRES;
DOI
10.1002/nla.792
Chinese Library Classification number
O29 [Applied mathematics]
Subject classification code
070104
Abstract
For a Markov reward process in which upper and lower bounds on the transition rates and rewards are known, a new approach to bounding the expected reward is presented. A previous paper defined sharp bounds for this problem but proposed only an inefficient and unstable algorithm; this paper presents algorithms that compute the bounds by interpreting the problem as a Markov decision process. In this way, the well-known value iteration and policy iteration algorithms can be adopted to compute reward bounds in a stable and fairly efficient way. Different numerical techniques are presented for computing the reward bounds. Copyright (C) 2011 John Wiley & Sons, Ltd.
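The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a discrete-time chain instead of transition rates, assumes each state offers a small finite set of admissible transition rows (for example, the vertices of its interval bounds), and uses plain relative value iteration for the long-run average reward. The function name `gain_bound` and all numbers are hypothetical.

```python
import numpy as np

def gain_bound(candidate_rows, rewards, maximize, tol=1e-12, max_iter=100_000):
    """Relative value iteration for the long-run average reward (gain).

    candidate_rows[s] lists the admissible transition rows of state s
    (an assumption of this sketch); choosing one row per state is the
    "decision" that turns the bounding problem into an MDP.
    Assumes every resulting chain is unichain and aperiodic.
    """
    pick = max if maximize else min
    n = len(candidate_rows)
    h = np.zeros(n)  # relative values, normalised so that h[0] == 0
    for _ in range(max_iter):
        # One Bellman update: best admissible row per state.
        th = np.array([
            rewards[s] + pick(float(np.dot(row, h)) for row in candidate_rows[s])
            for s in range(n)
        ])
        gain = th[0]          # current estimate of the average reward
        h_new = th - gain     # renormalise against reference state 0
        if np.max(np.abs(h_new - h)) < tol:
            return gain
        h = h_new
    raise RuntimeError("relative value iteration did not converge")

# Hypothetical two-state example: state 0 has two admissible rows,
# state 1 has one; the reward of state 0 lies in [1, 2], state 1 earns 0.
rows = [
    [np.array([0.5, 0.5]), np.array([0.7, 0.3])],
    [np.array([0.4, 0.6])],
]
reward_low = [1.0, 0.0]
reward_high = [2.0, 0.0]

lower = gain_bound(rows, reward_low, maximize=False)
upper = gain_bound(rows, reward_high, maximize=True)
print(f"average reward lies in [{lower:.6f}, {upper:.6f}]")
```

For this toy data the stationary distributions can be checked by hand: the lower bound corresponds to the row (0.5, 0.5) with reward 1, giving 4/9, and the upper bound to the row (0.7, 0.3) with reward 2, giving 8/7.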
Pages: 919-930
Number of pages: 12