On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Cited by: 0
Authors
Anagnostides, Ioannis [1]
Panageas, Ioannis [2]
Farina, Gabriele [3]
Sandholm, Tuomas [1, 4, 5, 6]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Calif Irvine, Irvine, CA USA
[3] MIT, Cambridge, MA USA
[4] Strateg Machine Inc, Charlotte, NC USA
[5] Strategy Robot Inc, Pittsburgh, PA USA
[6] Optimized Markets Inc, Pittsburgh, PA USA
Funding
U.S. National Science Foundation;
Keywords
POKER;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multi-agent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also extend to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.
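As a rough illustration of the setting described in the abstract, the sketch below runs optimistic gradient descent (OGD) for both players of a two-action zero-sum matrix game whose payoff matrix changes over time, and records the equilibrium gap at each round. It is not the authors' implementation: the game sequence `game_at` (matching pennies plus a perturbation decaying as 1/t^2), the step size `eta`, the starting points, and all function names are assumptions chosen for illustration. The update uses the common two-sequence form of OGD in which the prediction is the previous round's gradient.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:          # condition holds on a prefix; keep the last threshold
            theta = t
    return [max(vi - theta, 0.0) for vi in v]


def mat_vec(A, y):
    """Return A y."""
    return [sum(a * b for a, b in zip(row, y)) for row in A]


def vec_mat(x, A):
    """Return x^T A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]


def equilibrium_gap(A, x, y):
    """Gap of (x, y) when x minimizes and y maximizes x^T A y (an assumed convention)."""
    return max(vec_mat(x, A)) - min(mat_vec(A, y))


def game_at(t):
    """Hypothetical game sequence: matching pennies plus a 1/t^2 perturbation."""
    eps = 1.0 / (t * t)
    return [[1.0 + eps, -1.0], [-1.0, 1.0 - eps]]


def run_ogd(T, eta=0.1):
    """Run OGD for T rounds and return the equilibrium gap observed in each round."""
    x_hat, y_hat = [0.9, 0.1], [0.2, 0.8]   # arbitrary starting points
    pred_x, pred_y = [0.0, 0.0], [0.0, 0.0]  # predictions: previous gradients
    gaps = []
    for t in range(1, T + 1):
        A = game_at(t)
        # Optimistic step: move from the secondary iterate using the prediction.
        x = project_simplex([xi - eta * m for xi, m in zip(x_hat, pred_x)])
        y = project_simplex([yi + eta * m for yi, m in zip(y_hat, pred_y)])
        gaps.append(equilibrium_gap(A, x, y))
        # Observe the gradients of x^T A y at the played strategies.
        grad_x, grad_y = mat_vec(A, y), vec_mat(x, A)
        # Secondary iterate: move using the observed gradient.
        x_hat = project_simplex([xi - eta * g for xi, g in zip(x_hat, grad_x)])
        y_hat = project_simplex([yi + eta * g for yi, g in zip(y_hat, grad_y)])
        pred_x, pred_y = grad_x, grad_y
    return gaps


if __name__ == "__main__":
    gaps = run_ogd(200)
    print("first gap:", gaps[0], "last gap:", gaps[-1])
```

Replacing `game_at` with a sequence that varies more or less from round to round is one way to see how the observed gaps respond to the amount of variation, which is the kind of dependence the abstract says the bounds are parameterized on.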
Pages: 39
Related papers
50 in total
  • [31] Convergence Analysis of the Best Response Algorithm for Time-Varying Games
    Wang, Zifan
    Shen, Yi
    Zavlanos, Michael M.
    Johansson, Karl H.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 1144 - 1149
  • [32] No-regret learning for repeated non-cooperative games with lossy bandits
    Liu, Wenting
    Lei, Jinlong
    Yi, Peng
    Hong, Yiguang
    AUTOMATICA, 2024, 160
  • [33] On the Complexity of Computing Sparse Equilibria and Lower Bounds for No-Regret Learning in Games
    Anagnostides, Ioannis
    Kalavasis, Alkis
    Sandholm, Tuomas
    Zampetakis, Manolis
    15TH INNOVATIONS IN THEORETICAL COMPUTER SCIENCE CONFERENCE, ITCS 2024, 2024,
  • [34] DYNAMICS OF TIME-VARYING THRESHOLD LEARNING
    SKLANSKY, J
    BERSHAD, NJ
    INFORMATION AND CONTROL, 1969, 15 (06): : 455 - &
  • [35] DYNAMICS OF TIME-VARYING THRESHOLD LEARNING
    SKLANSKY, J
    BERSHAD, NJ
    IEEE TRANSACTIONS ON INFORMATION THEORY, 1970, 16 (01) : 127 - +
  • [36] On the Convergence of Regret Minimization Dynamics in Concave Games
    Dar, Eyal Even
    Mansour, Yishay
    Nadav, Uri
    STOC'09: PROCEEDINGS OF THE 2009 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2009, : 523 - 532
  • [37] No-regret dynamics and fictitious play
    Viossat, Yannick
    Zapechelnyuk, Andriy
    JOURNAL OF ECONOMIC THEORY, 2013, 148 (02) : 825 - 842
  • [38] No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium
    Celli, Andrea
    Marchesi, Alberto
    Farina, Gabriele
    Gatti, Nicola
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [39] Fast Convergence for Time-Varying Semi-Anonymous Potential Games
    Borowski, Holly
    Marden, Jason R.
    2014 AMERICAN CONTROL CONFERENCE (ACC), 2014, : 5384 - 5389
  • [40] Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback
    Ba, Wenjia
    Lin, Tianyi
    Zhang, Jiawei
    Zhou, Zhengyuan
    OPERATIONS RESEARCH, 2025,