Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration

Cited by: 0

Authors:

Akian, Marianne [1,2]

Gaubert, Stephane [1,2]

Qu, Zheng [3]

Saadi, Omar [1,2]

Affiliations:

[1] Ecole Polytech, INRIA, F-91128 Palaiseau, France

[2] Ecole Polytech, CMAP, Route Saclay, F-91128 Palaiseau, France

[3] Univ Hong Kong, Dept Math, Room 419, Run Run Shaw Bldg, Pokfulam Rd, Hong Kong, Peoples R China

Source:

2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC) | 2019

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).


Pages: 5963-5970

Number of pages: 8

