Improving strategies in stochastic games

Cited by: 0
Authors
Flesch, J [1 ]
Thuijsman, F [1 ]
Vrieze, OJ [1 ]
Affiliation
[1] Maastricht Univ, Dept Math, NL-6200 MD Maastricht, Netherlands
Source
PROCEEDINGS OF THE 37TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4 | 1998
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In a zero-sum limiting average stochastic game, we evaluate a strategy $\pi$ for the maximizing player, player 1, by the reward $\phi_s(\pi)$ that $\pi$ guarantees to him when starting in state $s$. A strategy $\pi$ is called non-improving if $\phi_s(\pi) \geq \phi_s(\pi[h])$ for any state $s$ and for any finite history $h$, where $\pi[h]$ is the strategy $\pi$ conditional on the history $h$; otherwise the strategy is called improving. We investigate the use of improving and non-improving strategies, and explore the relation between (non-)improvingness and ($\varepsilon$-)optimality. Improving strategies appear to play a very important role for obtaining $\varepsilon$-optimality, while 0-optimal strategies are always non-improving. Several examples will clarify all these issues.
Pages: 2674-2679
Number of pages: 6