Mixed Markov decision processes in a semi-Markov environment with discounted criterion

Cited by: 0
Authors
Hu, QY [1]
Wang, JL
Affiliations
[1] Xidian Univ, Sch Econ & Management, Xian 710071, Peoples R China
[2] Zhengzhou Inst Technol, Dept Math & Phys, Zhengzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1006/jmaa.1997.5792
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper presents a new model: the mixed Markov decision process (MDP) in a semi-Markov environment with the discounted criterion. It describes a system that behaves like an MDP, except that the system is influenced by an environment that evolves as a semi-Markov process. Following each state transition of the environment, the MDP model changes among a discrete-time MDP, a continuous-time MDP, and a semi-MDP. After presenting the model, we show the validity of the optimality equation and the existence of epsilon-optimal policies. Finally, the mixed MDP in a Markov environment is transformed into a discrete-time MDP. © 1998 Academic Press.
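For orientation, the discounted optimality equation in the simplest setting the abstract mentions, a finite discrete-time MDP, can be solved by value iteration. The sketch below is a generic textbook illustration and not the paper's mixed model or its proof technique. The function name `value_iteration`, the array layout of `P` and `r`, the discount factor `beta`, and the two-state example are all assumptions made for illustration.

```python
def value_iteration(P, r, beta, tol=1e-10, max_iter=10_000):
    """Approximate the discounted value function of a finite discrete-time MDP.

    Hypothetical layout (not from the paper):
      P[a][s][t] : probability of moving from state s to state t under action a
      r[s][a]    : one-step reward for taking action a in state s
      beta       : discount factor, 0 <= beta < 1
    """
    n_states = len(r)
    n_actions = len(r[0])

    def q_value(V, s, a):
        # One-step lookahead: immediate reward plus discounted expected value.
        return r[s][a] + beta * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Apply the optimality operator: V(s) <- max_a q(s, a).
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        diff = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if diff < tol:
            break

    # Greedy policy with respect to the approximate value function.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Toy example (assumed data): in state 0, action 0 stays (reward 1) and
# action 1 moves to state 1 (reward 0); state 1 is absorbing with reward 3.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [0.0, 1.0]],  # action 1
]
r = [
    [1.0, 0.0],  # state 0: rewards for actions 0 and 1
    [3.0, 3.0],  # state 1: rewards for actions 0 and 1
]
V, policy = value_iteration(P, r, beta=0.5)
```

In this toy instance the fixed point satisfies V(1) = 3 / (1 - 0.5) = 6 and V(0) = max(1 + 0.5 V(0), 0.5 V(1)) = 3, so the greedy policy moves from state 0 to state 1. Stopping the iteration at a finite tolerance yields a policy that is only near-optimal, which is the role played by epsilon-optimal policies.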
Pages: 1-20
Number of pages: 20