Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Cited: 0
Authors
Sun, Youbang [1 ]
Liu, Tao [2 ]
Zhou, Ruida [2 ]
Kumar, P. R. [2 ]
Shahrampour, Shahin [1 ]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Texas A&M Univ, College Stn, TX USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent reinforcement learning problem in Markov potential games. It is shown that, under mild technical assumptions and the introduction of the suboptimality gap, the independent NPG method with an oracle providing exact policy evaluation asymptotically reaches an epsilon-Nash Equilibrium (NE) within O(1/epsilon) iterations. This improves upon the previous best result of O(1/epsilon^2) iterations and is of the same order, O(1/epsilon), as is achievable in the single-agent case. Empirical results for a synthetic potential game and a congestion game are presented to verify the theoretical bounds.
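For intuition, the update the abstract describes reduces, for softmax policies, to a multiplicative-weights step on each agent's exact action values. The sketch below is a minimal illustration, not the authors' implementation: it runs independent NPG on a two-agent, single-state identical-interest game (the simplest potential game), where the exact-evaluation oracle is just an expected-payoff computation; the payoff matrix, step size, and all names are assumptions for the demo.

```python
import numpy as np

# Minimal sketch (assumptions throughout, not the paper's code): independent
# NPG on a two-agent, single-state identical-interest game, the simplest
# potential game. With softmax policies and exact policy evaluation, the
# NPG update is a multiplicative-weights step on exact action values.

rng = np.random.default_rng(0)
n = 4                               # actions per agent (illustrative)
phi = rng.standard_normal((n, n))   # shared payoff = potential function

eta, T = 0.1, 2000                  # step size and iteration budget
pi1 = np.full(n, 1.0 / n)           # agent 1's policy (uniform init)
pi2 = np.full(n, 1.0 / n)           # agent 2's policy (uniform init)

for _ in range(T):
    # Exact "oracle" evaluation: each agent's expected payoff per action,
    # marginalizing over the other agent's current policy.
    q1 = phi @ pi2
    q2 = phi.T @ pi1
    # Independent NPG for softmax policies = multiplicative weights;
    # each agent updates using only its own evaluation, with no coordination.
    pi1 = pi1 * np.exp(eta * q1)
    pi1 /= pi1.sum()
    pi2 = pi2 * np.exp(eta * q2)
    pi2 /= pi2.sum()

# NE gap: the larger of the two agents' best-response improvements;
# an epsilon-NE means this gap is at most epsilon.
value = pi1 @ phi @ pi2
gap = max((phi @ pi2).max() - value, (phi.T @ pi1).max() - value)
print(f"NE gap after {T} iterations: {gap:.2e}")
```

The simultaneous, uncoupled updates are what makes the method "independent"; under the abstract's O(1/epsilon) rate, halving the target gap should roughly double the iteration budget needed.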
Pages: 21
Related Papers
42 in total
  • [21] Expected Policy Gradient for Network Aggregative Markov Games in Continuous Space
    Moghaddam, Alireza Ramezani
    Kebriaei, Hamed
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024: 1 - 10
  • [22] On the convergence of policy gradient methods to Nash equilibria in general stochastic games
    Giannou, Angeliki
    Lotidis, Kyriakos
    Mertikopoulos, Panayotis
    Vlatakis-Gkaragkounis, Emmanouil V.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [23] Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
    Qiu, Shuang
    Wei, Xiaohan
    Ye, Jieping
    Wang, Zhaoran
    Yang, Zhuoran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [24] Convergence to Nash Equilibrium and No-regret Guarantee in (Markov) Potential Games
    Dong, Jing
    Wang, Baoxiang
    Yu, Yaoliang
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [25] Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
    Zhang, Guodong
    Martens, James
    Grosse, Roger
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games
    Zhou, Zhaoyi
    Chen, Zaiwei
    Lin, Yiheng
    Wierman, Adam
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216: 2563 - 2573
  • [27] Policy Gradient Algorithm in Two-Player Zero-Sum Markov Games
    Li, Y.
    Zhou, J.
    Feng, Y.
    Feng, Y.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (01): 81 - 91
  • [28] On linear and super-linear convergence of Natural Policy Gradient algorithm
    Khodadadian, Sajad
    Jhunjhunwala, Prakirt Raj
    Varma, Sushil Mahavir
    Maguluri, Siva Theja
    SYSTEMS & CONTROL LETTERS, 2022, 164
  • [29] Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games
    Chen, Yan
    Li, Tao
    IFAC PAPERSONLINE, 2023, 56 (02): 3435 - 3440
  • [30] An off-policy natural policy gradient method for a partial observable Markov decision process
    Nakamura, Y.
    Mori, T.
    Ishii, S.
    ARTIFICIAL NEURAL NETWORKS: FORMAL MODELS AND THEIR APPLICATIONS - ICANN 2005, PT 2, PROCEEDINGS, 2005, 3697 : 431 - 436