Tight last-iterate convergence rates for no-regret learning in multi-player games

Cited by: 0

Authors:

Golowich, Noah [1]

Pattathil, Sarath [2]

Daskalakis, Constantinos [1]

Affiliations:

[1] MIT, CSAIL, Cambridge, MA 02139 USA

[2] MIT, EECS, Cambridge, MA 02139 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020, Vol. 33

Keywords:

DYNAMICS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O(1/\sqrt{T})$ with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who asked whether extra-gradient approaches (such as OG) can be applied to achieve improved guarantees in the multi-agent learning setting. The proof of our upper bound uses a new technique centered around an adaptive choice of potential function at each iteration. We also show that the $O(1/\sqrt{T})$ rate is tight for all p-SCLI algorithms, a class that includes OG as a special case. As a byproduct of our lower bound analysis, we additionally present a proof of a conjecture of Arjevani et al. (2015) which is more direct than previous approaches.
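As a rough illustration of the optimistic gradient update mentioned in the abstract, the following minimal Python sketch runs the unconstrained form z_{t+1} = z_t - 2*eta*F(z_t) + eta*F(z_{t-1}) with a constant step-size. The toy game f(x, y) = x * y, the step-size 0.1, the starting point, and the use of the operator norm ||F(z)|| as a stand-in for the gap function are all assumptions made for this example and are not taken from the paper.

```python
import math


def F(z):
    """Game operator of the assumed toy min-max problem f(x, y) = x * y.

    Player x minimizes f and player y maximizes f, so
    F(x, y) = (df/dx, -df/dy) = (y, -x). The unique equilibrium is (0, 0).
    """
    x, y = z
    return (y, -x)


def optimistic_gradient(z0, eta, steps):
    """Run the unconstrained optimistic gradient update with constant step-size.

    Update rule: z_{t+1} = z_t - 2*eta*F(z_t) + eta*F(z_{t-1}),
    initialized with z_{-1} = z_0 (an assumption of this sketch).
    Returns the last iterate, not an average of iterates.
    """
    z_prev, z = z0, z0
    for _ in range(steps):
        g, g_prev = F(z), F(z_prev)
        z_next = tuple(
            zi - 2.0 * eta * gi + eta * gpi
            for zi, gi, gpi in zip(z, g, g_prev)
        )
        z_prev, z = z, z_next
    return z


def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])


z_start = (1.0, 1.0)
z_last = optimistic_gradient(z_start, eta=0.1, steps=2000)

# ||F(z)|| is used here only as a simple proxy for distance to equilibrium.
initial_residual = norm(F(z_start))
final_residual = norm(F(z_last))
print(initial_residual, final_residual)
```

On this particular bilinear toy problem the last iterate contracts geometrically toward (0, 0), so the run does not exhibit the worst-case $O(1/\sqrt{T})$ behavior; it only shows the shape of the update and that the last iterate itself, rather than an average, approaches the equilibrium.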


Pages: 13

37 items in total

[31] Off-Policy Model-Free Learning for Multi-Player Non-Zero-Sum Games With Constrained Inputs
Huo, Yu
Wang, Ding
Qiao, Junfei
Li, Menghua
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (02) : 910 - 920
[32] Integral reinforcement learning off-policy method for solving nonlinear multi-player nonzero-sum games with saturated actuator
Ren, He
Zhang, Huaguang
Wen, Yinlei
Liu, Chong
NEUROCOMPUTING, 2019, 335 : 96 - 104
[33] Off-policy Q-learning: Solving Nash equilibrium of multi-player games with network-induced delay and unmeasured state
Li, Jinna
Xiao, Zhenfei
Fan, Jialu
Chai, Tianyou
Lewis, Frank L. L.
AUTOMATICA, 2022, 136
[34] Integral Reinforcement Learning-Based Optimal Control for Nonzero-Sum Games of Multi-Player Input-Constrained Nonlinear Systems
Wu, Qiuye
Zhao, Bo
Liu, Derong
2021 7TH INTERNATIONAL CONFERENCE ON ROBOTICS AND ARTIFICIAL INTELLIGENCE, ICRAI 2021, 2021, : 59 - 63
[35] Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming
Jiang, He
Zhang, Huaguang
Xie, Xiangpeng
Han, Ji
NEUROCOMPUTING, 2019, 344 : 13 - 19
[36] Neural networks-based optimal tracking control for nonzero-sum games of multi-player continuous-time nonlinear systems via reinforcement learning
Zhao, Jingang
NEUROCOMPUTING, 2020, 412 : 167 - 176
[37] Extragradient-type methods with $\mathcal{O}(1/k)$ last-iterate convergence rates for co-hypomonotone inclusions
Quoc Tran-Dinh
Journal of Global Optimization, 2024, 89 (1) : 197 - 221
