Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

Cited: 0
Authors
Sun, Youbang [1 ]
Liu, Tao [2 ]
Zhou, Ruida [2 ]
Kumar, P. R. [2 ]
Shahrampour, Shahin [1 ]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Texas A&M Univ, College Stn, TX USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent reinforcement learning problem in Markov potential games. It is shown that, under mild technical assumptions and the introduction of the suboptimality gap, the independent NPG method with an oracle providing exact policy evaluation asymptotically reaches an epsilon-Nash Equilibrium (NE) within O(1/epsilon) iterations. This improves upon the previous best result of O(1/epsilon^2) iterations and is of the same order, O(1/epsilon), as is achievable in the single-agent case. Empirical results for a synthetic potential game and a congestion game are presented to verify the theoretical bounds.
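For intuition, the update the abstract describes reduces, for softmax policies, to a multiplicative-weights step on each agent's exact action values. The sketch below is a minimal illustration, not the authors' implementation: it runs independent NPG on a two-agent, single-state identical-interest game (the simplest potential game), where the exact-evaluation oracle is just an expected-payoff computation; the payoff matrix, step size, and all names are assumptions for the demo.

```python
import numpy as np

# Minimal sketch (assumptions throughout, not the paper's code): independent
# NPG on a two-agent, single-state identical-interest game, the simplest
# potential game. With softmax policies and exact policy evaluation, the
# NPG update is a multiplicative-weights step on exact action values.

rng = np.random.default_rng(0)
n = 4                               # actions per agent (illustrative)
phi = rng.standard_normal((n, n))   # shared payoff = potential function

eta, T = 0.1, 2000                  # step size and iteration budget
pi1 = np.full(n, 1.0 / n)           # agent 1's policy (uniform init)
pi2 = np.full(n, 1.0 / n)           # agent 2's policy (uniform init)

for _ in range(T):
    # Exact "oracle" evaluation: each agent's expected payoff per action,
    # marginalizing over the other agent's current policy.
    q1 = phi @ pi2
    q2 = phi.T @ pi1
    # Independent NPG for softmax policies = multiplicative weights;
    # each agent updates using only its own evaluation, with no coordination.
    pi1 = pi1 * np.exp(eta * q1)
    pi1 /= pi1.sum()
    pi2 = pi2 * np.exp(eta * q2)
    pi2 /= pi2.sum()

# NE gap: the larger of the two agents' best-response improvements;
# an epsilon-NE means this gap is at most epsilon.
value = pi1 @ phi @ pi2
gap = max((phi @ pi2).max() - value, (phi.T @ pi1).max() - value)
print(f"NE gap after {T} iterations: {gap:.2e}")
```

The simultaneous, uncoupled updates are what makes the method "independent"; under the abstract's O(1/epsilon) rate, halving the target gap should roughly double the iteration budget needed.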
Pages: 21
Related Papers
42 in total
  • [21] Expected Policy Gradient for Network Aggregative Markov Games in Continuous Space
    Moghaddam, Alireza Ramezani
    Kebriaei, Hamed
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024: 1 - 10
  • [22] On the convergence of policy gradient methods to Nash equilibria in general stochastic games
    Giannou, Angeliki
    Lotidis, Kyriakos
    Mertikopoulos, Panayotis
    Vlatakis-Gkaragkounis, Emmanouil V.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [23] Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
    Qiu, Shuang
    Wei, Xiaohan
    Ye, Jieping
    Wang, Zhaoran
    Yang, Zhuoran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [24] Convergence to Nash Equilibrium and No-regret Guarantee in (Markov) Potential Games
    Dong, Jing
    Wang, Baoxiang
    Yu, Yaoliang
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [25] Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
    Zhang, Guodong
    Martens, James
    Grosse, Roger
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games
    Zhou, Zhaoyi
    Chen, Zaiwei
    Lin, Yiheng
    Wierman, Adam
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216: 2563 - 2573
  • [27] Policy Gradient Algorithm in Two-Player Zero-Sum Markov Games
    Li, Y.
    Zhou, J.
    Feng, Y.
    Feng, Y.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (01): 81 - 91
  • [28] On linear and super-linear convergence of Natural Policy Gradient algorithm
    Khodadadian, Sajad
    Jhunjhunwala, Prakirt Raj
    Varma, Sushil Mahavir
    Maguluri, Siva Theja
    SYSTEMS & CONTROL LETTERS, 2022, 164
  • [29] Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games
    Chen, Yan
    Li, Tao
    IFAC PAPERSONLINE, 2023, 56 (02): 3435 - 3440
  • [30] An off-policy natural policy gradient method for a partial observable Markov decision process
    Nakamura, Y.
    Mori, T.
    Ishii, S.
    ARTIFICIAL NEURAL NETWORKS: FORMAL MODELS AND THEIR APPLICATIONS - ICANN 2005, PT 2, PROCEEDINGS, 2005, 3697 : 431 - 436