Mean-Variance Problems for Finite Horizon Semi-Markov Decision Processes

Cited: 4
Authors
Huang, Yonghui [1 ]
Guo, Xianping [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China
Source
APPLIED MATHEMATICS AND OPTIMIZATION | 2015, Vol. 72, Issue 2
Keywords
Finite horizon semi-Markov decision processes; Mean-variance optimal policy; Dynamic programming; Value iteration; Policy improvement; Linear programming; PORTFOLIO SELECTION; RISK PROBABILITY; REWARD VARIANCE; MINIMIZATION;
DOI
10.1007/s00245-014-9278-9
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
This paper deals with a mean-variance problem for finite horizon semi-Markov decision processes. The state and action spaces are Borel spaces, and the reward function may be unbounded. The goal is to find an optimal policy that minimizes the finite horizon reward variance over the set of policies attaining a given mean. Using the theory of N-step contraction, we characterize the policies with a given mean and, under suitable conditions, convert the second-order moment of the finite horizon reward into the mean of an infinite horizon reward/cost generated by a discrete-time Markov decision process (MDP) with a two-dimensional state space and a new one-step reward/cost. We then establish the optimality equation and the existence of mean-variance optimal policies by employing existing results for discrete-time MDPs. We also provide value iteration and policy improvement algorithms for computing the value function and mean-variance optimal policies, respectively. In addition, a linear program and its dual are developed for solving the mean-variance problem.
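To make the stated objective concrete, here is a minimal formalization consistent with the abstract; the notation (R_T for the reward accumulated up to the horizon T, \Pi_\mu for the set of policies attaining mean \mu) is illustrative, not the paper's:

```latex
% Illustrative notation, not the paper's own: R_T is the reward
% accumulated up to the horizon T; \Pi_\mu collects the policies
% whose mean reward equals \mu.
\[
  \min_{\pi \in \Pi_\mu} \operatorname{Var}^{\pi}(R_T),
  \qquad
  \Pi_\mu := \bigl\{ \pi : \mathbb{E}^{\pi}[R_T] = \mu \bigr\}.
\]
% On \Pi_\mu the mean is fixed, so
\[
  \operatorname{Var}^{\pi}(R_T) = \mathbb{E}^{\pi}\bigl[R_T^2\bigr] - \mu^2,
\]
% hence minimizing the variance over \Pi_\mu is equivalent to
% minimizing the second moment E^pi[R_T^2] -- the quantity the
% abstract recasts as an infinite horizon expected reward/cost of
% an auxiliary discrete-time MDP.
```

The value iteration mentioned in the abstract then operates on that auxiliary discrete-time MDP. The sketch below shows only generic tabular value iteration, under assumptions the paper does not make (finite state and action sets, a known transition tensor `P`, reward matrix `r`, and discount factor `gamma`); the paper itself works with Borel spaces and a possibly unbounded reward:

```python
import numpy as np

def value_iteration(P, r, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Generic tabular value iteration (illustrative sketch only).

    P : (A, S, S) array, P[a, s, t] = probability of moving s -> t under a.
    r : (S, A) array of one-step rewards.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = r[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = r + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmax(axis=1)  # greedy policy at the (near-)fixed point
    return V, policy
```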
Pages: 233-259
Number of Pages: 27