CONTINUOUS TIME MARKOV DECISION PROGRAMMING WITH AVERAGE REWARD CRITERION AND UNBOUNDED REWARD RATE

Authors
郑少慧 [1]
Affiliation
[1] Shandong Mining Institute
Keywords
CONTINUOUS TIME MARKOV DECISION PROGRAMMING WITH AVERAGE REWARD CRITERION AND UNBOUNDED REWARD RATE; CTMDP
DOI
Not available
Abstract
This paper studies continuous-time Markov decision programming (CTMDP) with an unbounded reward rate. The economic criterion is the long-run average reward. For models with a countable state space and compact metric action sets, we present a set of sufficient conditions that ensure the existence of stationary optimal policies.
Pages: 6-16
Number of pages: 11