On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Citations: 0
Authors
Anagnostides, Ioannis [1]
Panageas, Ioannis [2]
Farina, Gabriele [3]
Sandholm, Tuomas [1, 4, 5, 6]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Calif Irvine, Irvine, CA USA
[3] MIT, Cambridge, MA USA
[4] Strateg Machine Inc, Charlotte, NC USA
[5] Strategy Robot Inc, Pittsburgh, PA USA
[6] Optimized Markets Inc, Pittsburgh, PA USA
Funding
US National Science Foundation;
Keywords
POKER;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multi-agent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also extend to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.
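As a rough illustration of the setting the abstract describes, the sketch below runs optimistic gradient descent (OGD) for both players of a two-action zero-sum game whose payoff matrix changes each round, and records the equilibrium gap of the iterates. It is a minimal sketch and not the paper's construction: the drifting game `game_at`, the step size `eta`, the horizon `T`, the starting points, and all helper names are assumptions chosen for illustration, and the update shown is the standard two-sequence form of OGD with Euclidean projection onto the simplex.

```python
def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        running_sum += uj
        candidate = (running_sum - 1.0) / j
        if uj - candidate > 0:  # holds for a prefix of j; keep the last one
            theta = candidate
    return [max(vi - theta, 0.0) for vi in v]


def matvec(A, y):
    """A y: gradient of x^T A y with respect to x."""
    return [sum(a * b for a, b in zip(row, y)) for row in A]


def rmatvec(A, x):
    """A^T x: gradient of x^T A y with respect to y."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def equilibrium_gap(A, x, y):
    """Best-response (duality) gap of (x, y): x minimizes, y maximizes x^T A y."""
    return max(rmatvec(A, x)) - min(matvec(A, y))


def game_at(t):
    """Hypothetical time-varying game: matching pennies plus a drift decaying as 1/t^2."""
    base = [[1.0, -1.0], [-1.0, 1.0]]
    drift = [[0.5, 0.0], [0.0, -0.5]]
    return [[base[i][j] + drift[i][j] / t**2 for j in range(2)] for i in range(2)]


def run_ogd(T=500, eta=0.1):
    """Run OGD for both players; return the last iterates and the per-round gaps."""
    x_hat, y_hat = [0.9, 0.1], [0.2, 0.8]   # secondary ("hat") sequences
    gx_prev, gy_prev = [0.0, 0.0], [0.0, 0.0]  # predictions: last observed gradients
    gaps = []
    for t in range(1, T + 1):
        A = game_at(t)
        # Optimistic step: move from the secondary iterate using the prediction.
        x = project_simplex([a - eta * g for a, g in zip(x_hat, gx_prev)])
        y = project_simplex([a + eta * g for a, g in zip(y_hat, gy_prev)])
        # Observe the gradients of the current game at the played strategies.
        gx, gy = matvec(A, y), rmatvec(A, x)
        # Update the secondary iterates with the observed gradients.
        x_hat = project_simplex([a - eta * g for a, g in zip(x_hat, gx)])
        y_hat = project_simplex([a + eta * g for a, g in zip(y_hat, gy)])
        gx_prev, gy_prev = gx, gy
        gaps.append(equilibrium_gap(A, x, y))
    return x, y, gaps


if __name__ == "__main__":
    x, y, gaps = run_ogd()
    print("final strategies:", x, y)
    print("first gap:", gaps[0], "last gap:", gaps[-1])
```

The drift here shrinks quickly, so the sequence of games has small total variation, which is the kind of quantity the abstract says the bounds are parameterized on. Replacing `game_at` with a sequence that keeps changing is a simple way to see how the recorded gaps respond to larger variation.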
Pages: 39