Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model

Authors
Li, Gen [1]
Chi, Yuejie [2]
Wei, Yuting [1]
Chen, Yuxin [1]
Affiliations
[1] UPenn, Philadelphia, PA 19104 USA
[2] CMU, Pittsburgh, PA USA
Keywords
COMPLEXITY; BOUNDS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper studies multi-agent reinforcement learning in Markov games, with the goal of learning Nash equilibria or coarse correlated equilibria (CCE) sample-optimally. All prior results suffer from at least one of two obstacles: the curse of multiple agents and the barrier of long horizon, regardless of the sampling protocol in use. We take a step towards settling this problem, assuming access to a flexible sampling mechanism: the generative model. Focusing on non-stationary finite-horizon Markov games, we develop a fast learning algorithm called Q-FTRL and an adaptive sampling scheme that leverage the optimism principle in online adversarial learning (particularly the Follow-the-Regularized-Leader (FTRL) method). Our algorithm learns an $\varepsilon$-approximate CCE in a general-sum Markov game using $\widetilde{O}\big(H^4 S \sum_{i=1}^{m} A_i / \varepsilon^2\big)$ samples, where $m$ is the number of players, $S$ indicates the number of states, $H$ is the horizon, and $A_i$ denotes the number of actions for the $i$-th player. This is minimax-optimal (up to logarithmic factors) when $m$ is fixed. When applied to two-player zero-sum Markov games, our algorithm provably finds an $\varepsilon$-approximate Nash equilibrium with a minimal number of samples. Along the way, we derive a refined regret bound for FTRL that makes explicit the role of variance-type quantities, which might be of independent interest.
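As a rough illustration of the scaling stated in the abstract, the sketch below evaluates the leading term H^4 * S * (A_1 + ... + A_m) / epsilon^2 for one made-up problem size, and compares the sum of the action counts with their product (the size of the joint action space, which is the quantity usually associated with the curse of multiple agents). The function names and the numeric values are hypothetical, chosen only for this example; constants and logarithmic factors are ignored, so the numbers indicate orders of magnitude rather than actual sample counts.

```python
from math import prod


def leading_sample_term(horizon, num_states, action_counts, eps):
    """Leading term H^4 * S * sum_i A_i / eps^2 (no constants, no log factors)."""
    return horizon ** 4 * num_states * sum(action_counts) / eps ** 2


def joint_action_count(action_counts):
    """Size of the joint action space, prod_i A_i."""
    return prod(action_counts)


# Hypothetical problem size, chosen only for illustration.
H, S, eps = 10, 20, 0.1
actions = [5, 5, 5]  # three players with five actions each

term = leading_sample_term(H, S, actions, eps)
print(f"sum of A_i      = {sum(actions)}")
print(f"product of A_i  = {joint_action_count(actions)}")
print(f"leading term    = {term:.3e}")
```

With three players the sum of action counts is 15 while the product is 125; the gap widens exponentially as more players are added, which is why a bound that depends on the sum rather than the product matters.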
Pages: 15