A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion

Authors
Rolando Cavazos-Cadena
Raúl Montes-de-Oca
Karel Sladký
Affiliations
[1] Universidad Autónoma Agraria Antonio Narro, Departamento de Estadística y Cálculo
[2] Universidad Autónoma Metropolitana, Departamento de Matemáticas
[3] Institute of Information Theory and Automation
Keywords
Strong sample-path optimality; Lyapunov function condition; Stationary policy; Expected average reward criterion
DOI
Not available
Abstract
This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.
Pages: 674-684
Number of pages: 10