On average reward semi-Markov decision processes with a general multichain structure

Cited by: 22
Authors
Jianyong, L [1]
Xiaobo, Z
Affiliations
[1] Acad Sinica, Inst Appl Math, Beijing 100080, Peoples R China
[2] Tsinghua Univ, Dept Ind Engn, Beijing 100084, Peoples R China
Keywords
semi-Markov decision processes; average reward criterion; multichain structure; data-transformation method; optimal policy
DOI
10.1287/moor.1030.0077
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
In this paper we investigate average reward semi-Markov decision processes with a general multichain structure using a data-transformation method. By solving the transformed discrete-time average Markov decision processes, we can obtain significant and interesting information on the original average semi-Markov decision processes. If the original semi-Markov decision processes satisfy some appropriate conditions, then stationary optimal policies in the transformed discrete-time models are also optimal in the original semi-Markov decision processes.
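The abstract does not spell out the transformation it uses. As a rough illustration only, the sketch below applies one classical data transformation from the semi-Markov decision process literature (often attributed to Schweitzer), which may differ from the paper's construction. The function name `transform_smdp`, the array layout `P[s, a, j]`, and the example numbers are all hypothetical, and every action is assumed to be available in every state.

```python
import numpy as np

def transform_smdp(P, r, tau, eta=None):
    """Map an SMDP to a discrete-time MDP (assumed Schweitzer-style transformation).

    P[s, a, j]: embedded transition probabilities of the SMDP
    r[s, a]:    expected reward earned per transition
    tau[s, a]:  expected sojourn time, strictly positive
    eta:        scaling constant; must satisfy
                0 < eta < tau[s, a] / (1 - P[s, a, s]) wherever P[s, a, s] < 1
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    tau = np.asarray(tau, dtype=float)
    n_states = P.shape[0]

    stay = np.einsum("sas->sa", P)      # self-transition probabilities P[s, a, s]
    leave = 1.0 - stay
    mask = leave > 1e-12
    bound = np.min(tau[mask] / leave[mask]) if mask.any() else np.inf
    if eta is None:
        # Any value strictly inside (0, bound) works; half the bound is a convenient choice.
        eta = 0.5 * bound if np.isfinite(bound) else 1.0
    if not 0.0 < eta < bound:
        raise ValueError("eta must lie strictly between 0 and the bound")

    identity = np.eye(n_states)[:, None, :]          # delta_{s j}, broadcast over actions
    r_t = r / tau                                    # reward rate per unit time
    P_t = identity + eta * (P - identity) / tau[:, :, None]
    return P_t, r_t, eta

# Hypothetical two-state, two-action SMDP used only to exercise the function.
P = np.array([[[0.5, 0.5], [0.0, 1.0]],
              [[0.3, 0.7], [1.0, 0.0]]])
r = np.array([[4.0, 1.0], [6.0, 2.0]])
tau = np.array([[2.0, 1.0], [4.0, 0.5]])
P_t, r_t, eta = transform_smdp(P, r, tau)
```

Under this assumed transformation, the transformed rows remain probability distributions, and for a stationary policy whose embedded chain has a single recurrent class the average reward per step of the transformed model equals the average reward per unit time of the original model. The multichain case treated in the paper needs the additional conditions the abstract alludes to.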
Pages: 339-352
Number of pages: 14