Continuous-Time Markov Decision Processes with Exponential Utility

Cited by: 28
Author
Zhang, Yi [1]
Affiliation
[1] Univ Liverpool, Dept Math Sci, Liverpool L69 7ZL, Merseyside, England
Keywords
continuous-time Markov decision processes; exponential utility; total undiscounted criteria; risk-sensitive criterion; optimality equation; SEMI-MARKOV
DOI
10.1137/16M1086261
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process, which has the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We essentially do not need to impose a condition on the growth of the transition and cost rates in the state, and the controlled process may be explosive.
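The reduction to a discrete-time problem and the resulting value iteration can be illustrated on a toy model. The sketch below is not taken from the paper: it assumes a finite state space, a risk-sensitivity coefficient of 1, strictly positive jump rates, and cost-free absorbing states, and the function name and data layout are hypothetical. It relies on the standard first-jump argument: if the sojourn time in state x under action a is exponential with rate q and the cost rate is c, then E[exp(c * sojourn)] equals q / (q - c) when c < q and is infinite otherwise, so the expected exponential utility V can be updated by V(x) = min over a of sum_y q(y|x,a) V(y) / (q - c).

```python
import math


def risk_sensitive_value_iteration(rates, costs, max_iter=10_000, tol=1e-12):
    """Value iteration for E[exp(total cost)] in a finite toy CTMDP (illustrative sketch).

    rates[x][a] is a dict {y: q(y|x,a)} of positive jump rates from x to y != x.
    costs[x][a] is the nonnegative cost rate in state x under action a.
    A state with an empty action list is treated as absorbing and cost-free.
    """
    n = len(rates)
    V = [1.0] * n              # exp(0) = 1: no cost has been accumulated yet
    policy = [None] * n
    for _ in range(max_iter):
        new_V = []
        for x in range(n):
            if not rates[x]:   # absorbing, cost-free state
                new_V.append(1.0)
                continue
            best, best_a = math.inf, None
            for a, jumps in enumerate(rates[x]):
                q = sum(jumps.values())      # total jump rate out of x
                c = costs[x][a]
                if q > c:
                    # first-jump step of the embedded discrete-time problem
                    value = sum(r * V[y] for y, r in jumps.items()) / (q - c)
                else:
                    value = math.inf         # E[exp(c * sojourn)] diverges
                if best_a is None or value < best:
                    best, best_a = value, a
            new_V.append(best)
            policy[x] = best_a
        converged = all(u == v or abs(u - v) < tol for u, v in zip(V, new_V))
        V = new_V
        if converged:
            break
    return V, policy


# Toy model: state 0 has two actions, state 1 is absorbing.
rates = [[{1: 2.0}, {1: 4.0}], []]
costs = [[1.0, 3.0], []]
V, policy = risk_sensitive_value_iteration(rates, costs)
print(V[0], policy[0], math.log(V[0]))  # V[0] is 2.0, so the certainty equivalent is log 2
```

In this toy model action 0 gives 2 / (2 - 1) = 2 and action 1 gives 4 / (4 - 3) = 4, so the iteration selects action 0. The faster but costlier action is penalized because the exponential utility weights long sojourns heavily.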
Pages: 2636 - 2660
Number of pages: 25
Related papers
50 in total
  • [22] Constrained total undiscounted continuous-time Markov decision processes
    Guo, Xianping
    Zhang, Yi
    BERNOULLI, 2017, 23 (03) : 1694 - 1736
  • [23] Constrained Continuous-Time Markov Decision Processes on the Finite Horizon
    Guo, Xianping
    Huang, Yonghui
    Zhang, Yi
    APPLIED MATHEMATICS AND OPTIMIZATION, 2017, 75 (02) : 317 - 341
  • [24] A survey of recent results on continuous-time Markov decision processes
    Xianping Guo
    Onésimo Hernández-Lerma
    Tomás Prieto-Rumeau
    Xi-Ren Cao
    Junyu Zhang
    Qiying Hu
    Mark E. Lewis
    Ricardo Vélez
    TOP, 2006, 14 : 177 - 261
  • [25] A characterization of meaningful schedulers for continuous-time Markov decision processes
    Wolovick, Nicolas
    Johr, Sven
    FORMAL MODELING AND ANALYSIS OF TIMED SYSTEMS, 2006, 4202 : 352 - 367
  • [26] Policy learning in continuous-time Markov decision processes using Gaussian Processes
    Bartocci, Ezio
    Bortolussi, Luca
    Brazdil, Tomas
    Milios, Dimitrios
    Sanguinetti, Guido
    PERFORMANCE EVALUATION, 2017, 116 : 84 - 100
  • [27] Discounted optimality for continuous-time Markov decision processes in Polish spaces
    Guo, Xianping
    2006 CHINESE CONTROL CONFERENCE, VOLS 1-5, 2006 : 1785 - 1787
  • [28] Average optimality for continuous-time Markov decision processes in Polish spaces
    Guo, Xianping
    Rieder, Ulrich
    ANNALS OF APPLIED PROBABILITY, 2006, 16 (02) : 730 - 756
  • [29] On continuous-time Markov processes in bargaining
    Houba, Harold
    ECONOMICS LETTERS, 2008, 100 (02) : 280 - 283
  • [30] ABSORBING CONTINUOUS-TIME MARKOV DECISION PROCESSES WITH TOTAL COST CRITERIA
    Guo, Xianping
    Vykertas, Mantas
    Zhang, Yi
    ADVANCES IN APPLIED PROBABILITY, 2013, 45 (02) : 490 - 519