Finite-horizon variance penalised Markov decision processes

Cited by: 7
Author
Collins E.J. [1]
Affiliation
[1] Department of Mathematics, University of Bristol
Keywords
Convex polytopes; Markov decision processes; Mean-variance tradeoff; Variance penalty
DOI
10.1007/BF01539805
Abstract
We consider a finite horizon Markov decision process with only terminal rewards. We describe a finite algorithm for computing a Markov deterministic policy which maximises the variance penalised reward and we outline a vertex elimination algorithm which can reduce the computation involved. © Springer-Verlag 1997.
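To make the objective in the abstract concrete, the following is a minimal brute-force sketch, not the finite algorithm or the vertex elimination algorithm of the paper. It assumes the variance penalised reward has the common form E[R] - theta * Var[R], where R is the terminal reward and theta >= 0 is a penalty weight. The names `P`, `r`, `theta`, `terminal_distribution`, `penalised_value` and `best_policy` are hypothetical and chosen only for illustration.

```python
from itertools import product


def terminal_distribution(P, policy, start, horizon):
    """State distribution after `horizon` steps under a Markov deterministic policy.

    P[a][s][s2] is the probability of moving from s to s2 under action a;
    policy[t][s] is the action taken at time t in state s (illustrative layout).
    """
    n = len(P[0])
    dist = [0.0] * n
    dist[start] = 1.0
    for t in range(horizon):
        nxt = [0.0] * n
        for s in range(n):
            if dist[s] == 0.0:
                continue
            a = policy[t][s]
            for s2 in range(n):
                nxt[s2] += dist[s] * P[a][s][s2]
        dist = nxt
    return dist


def penalised_value(dist, r, theta):
    """Assumed objective: mean terminal reward minus theta times its variance."""
    mean = sum(p * x for p, x in zip(dist, r))
    second_moment = sum(p * x * x for p, x in zip(dist, r))
    return mean - theta * (second_moment - mean ** 2)


def best_policy(P, r, theta, start, horizon):
    """Enumerate every Markov deterministic policy and keep the best one.

    The search is exponential in horizon * number of states, so it is
    only usable on toy problems.
    """
    n_actions, n = len(P), len(r)
    best = None
    for flat in product(range(n_actions), repeat=horizon * n):
        policy = [flat[t * n:(t + 1) * n] for t in range(horizon)]
        dist = terminal_distribution(P, policy, start, horizon)
        value = penalised_value(dist, r, theta)
        if best is None or value > best[0]:
            best = (value, policy)
    return best


# Toy problem: action 0 stays in state 0; action 1 moves to state 0 or 1 with equal probability.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [0.0, 1.0]]]
r = [1.0, 3.0]  # terminal rewards per state
risk_neutral = best_policy(P, r, theta=0.0, start=0, horizon=1)
risk_averse = best_policy(P, r, theta=2.0, start=0, horizon=1)
```

In this toy problem the risky action has mean 2 and variance 1, so it is preferred when theta is 0, while the safe action (mean 1, variance 0) is preferred once theta is 2. Exhaustive enumeration of this kind is what a more economical algorithm would aim to avoid.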
Pages: 35-39
Number of pages: 4