The Complexity of Markov Equilibrium in Stochastic Games

Cited by: 0
Authors
Daskalakis, Constantinos [1 ]
Golowich, Noah [1 ]
Zhang, Kaiqing [2 ]
Institutions
[1] MIT, Cambridge, MA 02139 USA
[2] Univ Maryland, College Pk, MD 20740 USA
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum stochastic games is PPAD-hard, even when there are two players, the game is turn-based, the discount factor is an absolute constant, and the approximation is an absolute constant. Our intractability results stand in sharp contrast to the results in normal-form games, where exact CCEs are efficiently computable. A fortiori, our results imply that, in the setting of multi-agent reinforcement learning (MARL), it is computationally hard to learn stationary Markov CCE policies in stochastic games, even when the interaction is two-player and turn-based, and both the discount factor and the desired approximation of the learned policies are absolute constants. In turn, these results stand in sharp contrast to single-agent reinforcement learning (RL), where near-optimal stationary Markov policies can be learned in a computationally efficient manner. Complementing our intractability results for stationary Markov CCEs, we provide a decentralized algorithm (assuming shared randomness among players) for learning a nonstationary Markov CCE policy with time and sample complexity polynomial in all problem parameters. All previous work on learning Markov CCE policies required time and sample complexity exponential in the number of players. On balance, our work advocates for the use of nonstationary Markov CCE policies as a computationally and statistically tractable solution concept in MARL, advancing an important and outstanding frontier in machine learning.
Pages: 55