Solving multichain stochastic games with mean payoff by policy iteration

Cited by: 0
Authors
Akian, Marianne [1, 2]
Cochet-Terrasson, Jean
Detournay, Sylvie [1, 2]
Gaubert, Stephane [1, 2]
Affiliations
[1] Ecole Polytech, INRIA Saclay Ile De France, F-91128 Palaiseau, France
[2] Ecole Polytech, CMAP, F-91128 Palaiseau, France
Keywords
THEOREM;
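DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract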
Zero-sum stochastic games with finite state and action spaces, perfect information, and mean payoff criteria arise in particular from the monotone discretization of mean-payoff pursuit-evasion deterministic differential games. In that case, no irreducibility assumption on the Markov chains associated with strategies is satisfied (multichain games). The value of such a game can be characterized by a system of nonlinear equations involving the mean payoff vector and an auxiliary vector (relative value or bias). Cochet-Terrasson and Gaubert proposed (C. R. Math. Acad. Sci. Paris, 2006) a policy iteration algorithm relying on a notion of nonlinear spectral projection (Akian and Gaubert, Nonlinear Analysis TMA, 2003), which allows one to avoid cycling in degenerate iterations. We give here a complete presentation of the algorithm, with details of its implementation, in particular of the nonlinear projection. This has led to the software PIGAMES and allowed us to present numerical results on pursuit-evasion games.
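To make the notion of a state-dependent mean payoff in a multichain game concrete, here is a minimal sketch in Python. It is not the policy iteration algorithm of the paper and not the PIGAMES software: it uses plain value iteration (iterating the dynamic programming operator and dividing by the number of steps) as a stand-in. The toy game, the data layout `game[i][a][b]`, the convention that MIN moves before MAX, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy game (not from the paper).
# game[i][a][b] = (reward, probs): at state i, MIN picks a, then MAX picks b;
# probs[j] is the probability of moving to state j.
GAME = [
    [  # state 0: transient
        [(0.0, [0.0, 1.0, 0.0]), (0.5, [1.0, 0.0, 0.0])],  # a=0: MAX moves to 1 or stays
        [(0.0, [0.0, 0.0, 1.0])],                          # a=1: forced move to 2
    ],
    [[(1.0, [0.0, 1.0, 0.0])]],  # state 1: absorbing, payoff 1 per step
    [[(3.0, [0.0, 0.0, 1.0])]],  # state 2: absorbing, payoff 3 per step
]


def shapley(game, v):
    """One dynamic programming step: min over MIN's actions of max over MAX's."""
    return [
        min(
            max(r + sum(p * x for p, x in zip(probs, v)) for r, probs in max_moves)
            for max_moves in min_moves
        )
        for min_moves in game
    ]


def mean_payoff_estimate(game, iters=1000):
    """Approximate the mean payoff vector by v_k / k (value iteration, assumed stand-in)."""
    v = [0.0] * len(game)
    for _ in range(iters):
        v = shapley(game, v)
    return [x / iters for x in v]


eta = mean_payoff_estimate(GAME)
print(eta)
```

States 1 and 2 form two separate recurrent classes, so the estimated mean payoff differs from state to state (close to 1 at states 0 and 1, and 3 at state 2). This is the multichain situation in which a single scalar value no longer describes the game.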
Pages: 1834-1841
Number of pages: 8
Related papers
50 in total
  • [1] A policy iteration algorithm for zero-sum stochastic games with mean payoff
    Cochet-Terrasson, Jean
    Gaubert, Stephane
    [J]. COMPTES RENDUS MATHEMATIQUE, 2006, 343 (05) : 377 - 382
  • [2] Solving Common-Payoff Games with Approximate Policy Iteration
    Sokota, Samuel
    Lockhart, Edward
    Timbers, Finbarr
    Davoodi, Elnaz
    D'Orazio, Ryan
    Burch, Neil
    Schmid, Martin
    Bowling, Michael
    Lanctot, Marc
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9695 - 9703
  • [3] Universal complexity bounds based on value iteration for stochastic mean payoff games and entropy games
    Allamigeon, Xavier
    Gaubert, Stéphane
    Katz, Ricardo D.
    Skomra, Mateusz
    [J]. Information and Computation, 2025, 302
  • [4] Simple Stochastic Games, Mean Payoff Games, Parity Games
    Zwick, Uri
    [J]. COMPUTER SCIENCE - THEORY AND APPLICATIONS, 2008, 5010 : 29 - 29
  • [5] Solving Mean-Payoff Games on the GPU
    Meyer, Philipp J.
    Luttenberger, Michael
    [J]. AUTOMATED TECHNOLOGY FOR VERIFICATION AND ANALYSIS, ATVA 2016, 2016, 9938 : 262 - 267
  • [6] Stochastic Window Mean-Payoff Games
    Doyen, Laurent
    Gaba, Pranshu
    Guha, Shibashis
    [J]. FOUNDATIONS OF SOFTWARE SCIENCE AND COMPUTATION STRUCTURES, PT I, FOSSACS 2024, 2024, 14574 : 34 - 54
  • [7] Strategy recovery for stochastic mean payoff games
    Mamino, Marcello
    [J]. THEORETICAL COMPUTER SCIENCE, 2017, 675 : 101 - 104
  • [8] Reduction of stochastic parity to stochastic mean-payoff games
    Chatterjee, Krishnendu
    Henzinger, Thomas A.
    [J]. INFORMATION PROCESSING LETTERS, 2008, 106 (01) : 1 - 7
  • [9] On Solving Mean Payoff Games Using Pivoting Algorithms
    Neogy, S. K.
    Mondal, Senjit
    Gupta, Abhijit
    Ghorui, Debasish
    [J]. ASIA-PACIFIC JOURNAL OF OPERATIONAL RESEARCH, 2018, 35 (05)
  • [10] Solving Mean-Payoff Games via Quasi Dominions
    Benerecetti, Massimo
    Dell'Erba, Daniele
    Mogavero, Fabio
    [J]. TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, PT II, TACAS 2020, 2020, 12079 : 289 - 306