Tight last-iterate convergence rates for no-regret learning in multi-player games

Citations: 0
Authors
Golowich, Noah [1]
Pattathil, Sarath [2]
Daskalakis, Constantinos [1]
Affiliations
[1] MIT, CSAIL, Cambridge, MA 02139 USA
[2] MIT, EECS, Cambridge, MA 02139 USA
Keywords
DYNAMICS
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of O(1/√T) with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who asked whether extra-gradient approaches (such as OG) can be applied to achieve improved guarantees in the multi-agent learning setting. The proof of our upper bound uses a new technique centered around an adaptive choice of potential function at each iteration. We also show that the O(1/√T) rate is tight for all p-SCLI algorithms, which includes OG as a special case. As a byproduct of our lower bound analysis we additionally present a proof of a conjecture of Arjevani et al. (2015) which is more direct than previous approaches.
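To make the optimistic gradient update concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used unconstrained form z_{t+1} = z_t - 2η F(z_t) + η F(z_{t-1}) with a constant step size η, and it runs on a toy two-player bilinear game min_x max_y x·y, which is monotone and has its unique equilibrium at (0, 0). The game, the step size 0.1, the starting point, and the function names are all illustrative assumptions. Distance to the equilibrium is used as a simple progress measure in place of the gap function studied in the abstract.

```python
import math


def F(z):
    """Game operator for the toy game min_x max_y x*y (an assumed example).

    Player x descends on x*y, player y ascends, so F(x, y) = (y, -x).
    """
    x, y = z
    return (y, -x)


def optimistic_gradient(z0, eta, steps):
    """Run the assumed OG recursion and return the list of iterates.

    Update: z_{t+1} = z_t - 2*eta*F(z_t) + eta*F(z_{t-1}),
    with z_{-1} set equal to z_0 so the first step is a plain gradient step.
    """
    z_prev, z = z0, z0
    history = [z0]
    for _ in range(steps):
        g, g_prev = F(z), F(z_prev)
        z_next = tuple(
            zi - 2.0 * eta * gi + eta * gpi
            for zi, gi, gpi in zip(z, g, g_prev)
        )
        z_prev, z = z, z_next
        history.append(z)
    return history


def norm(z):
    """Euclidean distance of an iterate from the equilibrium (0, 0)."""
    return math.hypot(*z)


if __name__ == "__main__":
    iterates = optimistic_gradient(z0=(1.0, 1.0), eta=0.1, steps=2000)
    for t in (0, 10, 100, 1000, 2000):
        print(f"t={t:5d}  distance to equilibrium = {norm(iterates[t]):.3e}")
```

The printed distances track the last iterate z_T itself rather than an average of the iterates, which is the quantity the abstract's rate refers to. On this particular bilinear example the last iterate contracts geometrically, so it does not exhibit the O(1/√T) worst case, which concerns general smooth monotone games.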
Pages: 13