Continuous-Time Markov Decision Processes with Exponential Utility

Cited by: 28
Authors
Zhang, Yi [1]
Affiliation
[1] Univ Liverpool, Dept Math Sci, Liverpool L69 7ZL, Merseyside, England
Keywords
continuous-time Markov decision processes; exponential utility; total undiscounted criteria; risk-sensitive criterion; optimality equation; semi-Markov
DOI
10.1137/16M1086261
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process with the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We essentially do not need to impose a condition on the growth of the transition and cost rates in the state, and the controlled process may be explosive.
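The reduction and value iteration mentioned in the abstract can be illustrated on a toy finite model. The sketch below is not taken from the paper: it assumes a finite state space, a risk-sensitivity coefficient of 1, and a cost-free absorbing state. It uses the standard fact that, for a sojourn time that is exponential with total rate q and a constant cost rate c < q, the expected exponential of the sojourn cost is q / (q - c), which gives the assumed one-step update V(x) = min over a of [sum over y of q(y|x,a) V(y)] / (q_x(a) - c(x,a)). The names MODEL, ABSORBING, and value_iteration, and all numbers, are hypothetical.

```python
import math

ABSORBING = 2  # hypothetical cost-free absorbing state

# Hypothetical toy model: (state, action) -> (cost_rate, {next_state: transition_rate})
MODEL = {
    (0, "slow"): (0.2, {1: 1.0}),
    (0, "fast"): (2.0, {ABSORBING: 3.0}),
    (1, "go"): (1.0, {ABSORBING: 2.0}),
}


def value_iteration(model, absorbing, n_states, tol=1e-12, max_iter=1000):
    """Iterate the assumed risk-sensitive update, starting from V = 1.

    V[x] approximates the minimal expected exponential of the total cost
    from state x; log(V[x]) is the corresponding certainty equivalent
    (risk-sensitivity coefficient assumed to be 1).
    """
    V = [1.0] * n_states  # V = 1 at the absorbing state: no further cost
    for _ in range(max_iter):
        new_V = list(V)
        for x in range(n_states):
            if x == absorbing:
                continue
            best = math.inf
            for (state, _action), (cost_rate, rates) in model.items():
                if state != x:
                    continue
                total_rate = sum(rates.values())
                if total_rate <= cost_rate:
                    # Expected exponential sojourn cost is infinite here.
                    continue
                numerator = sum(r * V[y] for y, r in rates.items())
                best = min(best, numerator / (total_rate - cost_rate))
            new_V[x] = best
        if all(abs(a - b) < tol for a, b in zip(new_V, V)):
            return new_V
        V = new_V
    return V


if __name__ == "__main__":
    V = value_iteration(MODEL, ABSORBING, n_states=3)
    print("exponential-utility values:", V)
    print("certainty equivalents:", [math.log(v) for v in V])
```

In this toy model, taking "slow" in state 0 and then "go" in state 1 yields a value of 1.25 × 2 = 2.5, which beats the value 3 of jumping directly with "fast"; the iteration settles on that choice after a few sweeps.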
Pages: 2636-2660
Page count: 25