Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Cited by: 0
Authors:
Sun, Youbang [1 ]
Liu, Tao [2 ]
Zhou, Ruida [2 ]
Kumar, P. R. [2 ]
Shahrampour, Shahin [1 ]
Affiliations:
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Texas A&M Univ, College Station, TX USA
Funding:
National Science Foundation (USA)
DOI:
Not available
CLC number:
TP18 [Artificial Intelligence Theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
This work studies an independent natural policy gradient (NPG) algorithm for multi-agent reinforcement learning in Markov potential games. It is shown that, under mild technical assumptions and by introducing a suboptimality-gap condition, independent NPG with an oracle providing exact policy evaluation reaches an epsilon-Nash equilibrium (NE) within O(1/epsilon) iterations. This improves upon the previous best bound of O(1/epsilon^2) iterations and matches the O(1/epsilon) rate achievable in the single-agent setting. Empirical results on a synthetic potential game and a congestion game corroborate the theoretical bounds.
Pages: 21
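The record itself contains no code, so the following is a minimal illustrative sketch of the independent softmax-NPG update described in the abstract, specialized to a single-state (stateless) two-agent potential game. In this special case each agent's exact policy-evaluation oracle is just an expectation of its payoff over the other agent's current policy, and the softmax NPG step reduces to a multiplicative-weights update. The payoff matrix PHI, the step size, and all function names are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Sketch: independent softmax-NPG on a stateless two-agent potential game.
# Both agents share the same payoff table PHI (identical interest), which
# makes PHI itself a potential function for the game.
rng = np.random.default_rng(0)
PHI = rng.standard_normal((3, 3))  # PHI[a1, a2]: illustrative payoffs

def npg_step(pi_i, q_i, eta):
    """One softmax-NPG update for one agent.

    With a softmax policy class and no states, the natural gradient step
    is a multiplicative-weights update on the agent's action values.
    """
    w = pi_i * np.exp(eta * q_i)
    return w / w.sum()

pi1 = np.ones(3) / 3  # agent 1 policy, uniform initialization
pi2 = np.ones(3) / 3  # agent 2 policy, uniform initialization
eta = 0.1             # step size (illustrative choice)

for t in range(2000):
    # Exact "oracle" evaluation: each agent's value for its own actions,
    # averaging over the other agent's current independent policy.
    q1 = PHI @ pi2    # q1[a1] = E_{a2 ~ pi2}[PHI[a1, a2]]
    q2 = PHI.T @ pi1  # q2[a2] = E_{a1 ~ pi1}[PHI[a1, a2]]
    pi1, pi2 = npg_step(pi1, q1, eta), npg_step(pi2, q2, eta)

# The joint policy concentrates on a Nash equilibrium of the potential
# game, i.e., on an argmax entry of PHI.
print(np.round(pi1, 3), np.round(pi2, 3))
```

The general algorithm analyzed in the paper runs this kind of update independently per agent in a full Markov game, with the oracle returning exact state-action values under the joint policy; the stateless reduction above only illustrates the structure of the per-agent step.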