On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games

Cited by: 0
Authors
Zhang, Runyu [1 ]
Mei, Jincheng [2 ]
Dai, Bo [2 ]
Schuurmans, Dale [2 ,3 ]
Li, Na [1 ]
Affiliations
[1] Harvard Univ, Cambridge, MA 02138 USA
[2] Google Res, Brain Team, Mountain View, CA USA
[3] Univ Alberta, Edmonton, AB, Canada
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Softmax policy gradient is a popular algorithm for policy optimization in single-agent reinforcement learning, in part because no projection is needed after each gradient update. In multi-agent systems, however, the lack of central coordination introduces significant additional difficulties into the convergence analysis. Even a stochastic game with identical interests can have multiple Nash equilibria (NEs), which rules out proof techniques that rely on the existence of a unique global optimum. Moreover, the softmax parameterization introduces non-NE policies with zero gradient, making it difficult for gradient-based algorithms to find NEs. In this paper, we study the finite-time convergence of decentralized softmax gradient play in a special class of games, Markov Potential Games (MPGs), which includes identical-interest games as a special case. We investigate both gradient play and natural gradient play, with and without log-barrier regularization. The convergence rates established for the unregularized cases contain a trajectory-dependent constant that can be arbitrarily large, whereas log-barrier regularization overcomes this drawback at the cost of a slightly worse dependence on other factors such as the action set size. An empirical study on an identical-interest matrix game confirms the theoretical findings.
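The abstract describes decentralized softmax gradient play and a log-barrier-regularized variant. As a rough illustration only, the sketch below runs softmax gradient play on a two-player identical-interest matrix game (the setting of the paper's empirical study); the payoff matrix, step size, barrier weight, and iteration count are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

# Minimal sketch: decentralized softmax gradient play on a 2-player
# identical-interest matrix game. Each agent keeps its own logits and
# updates them independently using only its own policy gradient.

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
n_actions = 3
R = rng.standard_normal((n_actions, n_actions))     # shared payoff matrix (identical interest)

theta = [np.zeros(n_actions), np.zeros(n_actions)]  # agents' softmax logits
eta = 0.5     # step size (assumed)
lam = 0.0     # log-barrier weight; set e.g. 0.01 for the regularized variant

for _ in range(2000):
    pi1, pi2 = softmax(theta[0]), softmax(theta[1])
    v = pi1 @ R @ pi2            # common expected payoff (the potential here)

    q1 = R @ pi2                 # agent 1: value of each pure action against pi2
    q2 = R.T @ pi1               # agent 2: value of each pure action against pi1

    # Softmax policy gradient: d v / d theta_i[a] = pi_i[a] * (q_i[a] - v).
    # The lam-term is the gradient of a log-barrier regularizer of the form
    # (lam / n_actions) * sum_a log pi_i(a); the paper's exact form may differ.
    theta[0] += eta * (pi1 * (q1 - v) + lam * (1.0 / n_actions - pi1))
    theta[1] += eta * (pi2 * (q2 - v) + lam * (1.0 / n_actions - pi2))

print("joint payoff reached:", softmax(theta[0]) @ R @ softmax(theta[1]))
print("best joint payoff   :", R.max())
```

Because the interests are identical, the shared payoff acts as the potential and both agents ascend the same function with purely local updates; the paper's analysis quantifies how fast such independent updates approach a Nash equilibrium.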
Pages: 13
Related Papers
27 in total
  • [1] Policy Gradient Play with Networked Agents in Markov Potential Games
    Aydin, Sarper
    Eksin, Ceyhun
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023
  • [2] On the convergence of distributed projected gradient play with heterogeneous learning rates in monotone games
    Tan, Shaolin
    Tao, Ye
    Ran, Maopeng
    Liu, Hao
    SYSTEMS & CONTROL LETTERS, 2023, 182
  • [3] Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games
    Sun, Youbang
    Liu, Tao
    Zhou, Ruida
    Kumar, P. R.
    Shahrampour, Shahin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games
    Zhou, Zhaoyi
    Chen, Zaiwei
    Lin, Yiheng
    Wierman, Adam
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2563 - 2573
  • [5] Policy Gradient Play over Time-Varying Networks in Markov Potential Games
    Aydin, Sarper
    Eksin, Ceyhun
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 1997 - 2002
  • [6] Decentralized Proximal Gradient Algorithms With Linear Convergence Rates
    Alghunaim, Sulaiman A.
    Ryu, Ernest K.
    Yuan, Kun
    Sayed, Ali H.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (06) : 2787 - 2794
  • [7] Convergence Bounds of Decentralized Fictitious Play Around a Single Nash Equilibrium in Near-Potential Games
    Aydin, Sarper
    Arefizadeh, Sina
    Eksin, Ceyhun
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 2519 - 2524
  • [8] Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
    Ding, Dongsheng
    Wei, Chen-Yu
    Zhang, Kaiqing
    Jovanovic, Mihailo R.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [9] On the Exponential Rate of Convergence of Fictitious Play in Potential Games
    Swenson, Brian
    Kar, Soummya
    2017 55TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2017, : 275 - 279
  • [10] Gradient Play in Stochastic Games: Stationary Points, Convergence, and Sample Complexity
    Zhang, Runyu
    Ren, Zhaolin
    Li, Na
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (10) : 6499 - 6514