Stationary policies with Markov partition property

Cited by: 1
Authors
Goulionis, John E. [1 ]
Stengos, Dimitrios I. [1 ]
Tzavelas, George [1 ]
Affiliations
[1] Univ Piraeus, Dept Stat & Insurance Sci, 80 Karaoli & Dimitriou St, Piraeus 18534, Greece
Source
Keywords
POMDPs; operations research; stationary policies; stochastic models
DOI
10.1080/09720510.2010.10701536
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
This paper treats the infinite-horizon discounted-cost control problem for partially observable Markov decision processes (POMDPs). Sondik [14] studied the class of finitely transient policies and showed that their value functions over an infinite time horizon are piecewise linear (p.w.l.) and can be computed exactly by solving a system of linear equations. However, the condition of finite transience is stronger than is needed to ensure p.w.l. value functions. In this paper we introduce, as an alternative, the class of periodic policies, whose value functions also turn out to be p.w.l. Moreover, we examine a condition more general than finite transience and periodicity. We apply these ideas to a replacement problem under Markovian deterioration, investigate periodic policies for this problem, and give numerical examples.
Pages: 1323-1341
Number of pages: 19