Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Cited by: 0
Authors:
Sun, Youbang [1 ]
Liu, Tao [2 ]
Zhou, Ruida [2 ]
Kumar, P. R. [2 ]
Shahrampour, Shahin [1 ]
Affiliations:
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Texas A&M Univ, College Station, TX USA
Funding:
National Science Foundation (USA)
DOI:
Not available
CLC number:
TP18 [Artificial Intelligence Theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
This work studies an independent natural policy gradient (NPG) algorithm for multi-agent reinforcement learning in Markov potential games. It is shown that, under mild technical assumptions and by introducing a suboptimality-gap condition, independent NPG with an oracle providing exact policy evaluation reaches an epsilon-Nash equilibrium (NE) within O(1/epsilon) iterations. This improves upon the previous best bound of O(1/epsilon^2) iterations and matches the O(1/epsilon) rate achievable in the single-agent setting. Empirical results on a synthetic potential game and a congestion game corroborate the theoretical bounds.
Pages: 21
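The record itself contains no code, so the following is a minimal illustrative sketch of the independent softmax-NPG update described in the abstract, specialized to a single-state (stateless) two-agent potential game. In this special case each agent's exact policy-evaluation oracle is just an expectation of its payoff over the other agent's current policy, and the softmax NPG step reduces to a multiplicative-weights update. The payoff matrix PHI, the step size, and all function names are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Sketch: independent softmax-NPG on a stateless two-agent potential game.
# Both agents share the same payoff table PHI (identical interest), which
# makes PHI itself a potential function for the game.
rng = np.random.default_rng(0)
PHI = rng.standard_normal((3, 3))  # PHI[a1, a2]: illustrative payoffs

def npg_step(pi_i, q_i, eta):
    """One softmax-NPG update for one agent.

    With a softmax policy class and no states, the natural gradient step
    is a multiplicative-weights update on the agent's action values.
    """
    w = pi_i * np.exp(eta * q_i)
    return w / w.sum()

pi1 = np.ones(3) / 3  # agent 1 policy, uniform initialization
pi2 = np.ones(3) / 3  # agent 2 policy, uniform initialization
eta = 0.1             # step size (illustrative choice)

for t in range(2000):
    # Exact "oracle" evaluation: each agent's value for its own actions,
    # averaging over the other agent's current independent policy.
    q1 = PHI @ pi2    # q1[a1] = E_{a2 ~ pi2}[PHI[a1, a2]]
    q2 = PHI.T @ pi1  # q2[a2] = E_{a1 ~ pi1}[PHI[a1, a2]]
    pi1, pi2 = npg_step(pi1, q1, eta), npg_step(pi2, q2, eta)

# The joint policy concentrates on a Nash equilibrium of the potential
# game, i.e., on an argmax entry of PHI.
print(np.round(pi1, 3), np.round(pi2, 3))
```

The general algorithm analyzed in the paper runs this kind of update independently per agent in a full Markov game, with the oracle returning exact state-action values under the joint policy; the stateless reduction above only illustrates the structure of the per-agent step.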