Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Cited by: 0
Authors:
Sun, Youbang [1 ]
Liu, Tao [2 ]
Zhou, Ruida [2 ]
Kumar, P. R. [2 ]
Shahrampour, Shahin [1 ]
Affiliations:
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Texas A&M Univ, College Stn, TX USA
Funding:
U.S. National Science Foundation
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This work studies an independent natural policy gradient (NPG) algorithm for multi-agent reinforcement learning in Markov potential games. It is shown that, under mild technical assumptions and the introduction of a suboptimality gap, independent NPG with an oracle providing exact policy evaluation reaches an epsilon-Nash equilibrium (NE) within O(1/epsilon) iterations. This improves upon the previous best result of O(1/epsilon^2) iterations and matches the O(1/epsilon) rate achievable in the single-agent case. Empirical results on a synthetic potential game and a congestion game verify the theoretical bounds.
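To illustrate the kind of dynamics the abstract describes, here is a minimal sketch of independent NPG in a two-player identical-interest matrix game (a special case of a potential game). With softmax policies, each agent's natural gradient step reduces to a multiplicative-weights update on its own action values; the payoff matrix, step size, and variable names below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical shared potential: both players receive Phi[a1, a2].
Phi = np.array([[3.0, 0.0],
                [0.0, 2.0]])

def npg_step(pi, q, eta):
    # Exponentiated-gradient form of NPG under softmax parameterization:
    # pi(a) <- pi(a) * exp(eta * q(a)), then renormalize.
    new = pi * np.exp(eta * q)
    return new / new.sum()

pi1 = np.ones(2) / 2  # agent 1 starts uniform
pi2 = np.ones(2) / 2  # agent 2 starts uniform
eta = 0.5             # illustrative step size
for _ in range(200):
    q1 = Phi @ pi2    # agent 1's action values vs. agent 2's current policy
    q2 = Phi.T @ pi1  # agent 2's action values vs. agent 1's current policy
    # Independent updates: each agent uses only its own value estimates.
    pi1, pi2 = npg_step(pi1, q1, eta), npg_step(pi2, q2, eta)

print(pi1.round(3), pi2.round(3))  # both concentrate on the pure NE with payoff 3
```

Each agent updates independently, without observing the other's policy or gradient, which is the defining feature of the independent NPG scheme analyzed in the paper.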
Pages: 21