Near-Optimal No-Regret Learning Dynamics for General Convex Games

Cited by: 0
Authors
Farina, Gabriele [1 ]
Anagnostides, Ioannis [1 ]
Luo, Haipeng [2 ]
Lee, Chung-Wei [2 ]
Kroer, Christian [3 ]
Sandholm, Tuomas [1 ,4 ,5 ,6 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
[3] Columbia Univ, New York, NY 10027 USA
[4] Strategy Robot Inc, Pittsburgh, PA 15213 USA
[5] Optimized Markets Inc, Pittsburgh, PA 15213 USA
[6] Strateg Machine Inc, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA);
Keywords
CORRELATED EQUILIBRIA;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A recent line of work has established uncoupled learning dynamics such that, when employed by all players in a game, each player's regret after T repetitions grows polylogarithmically in T, an exponential improvement over the traditional guarantees within the no-regret framework. However, so far these results have been limited to certain classes of games with structured strategy spaces, such as normal-form and extensive-form games. Whether O(polylog T) regret bounds can be obtained for general convex and compact strategy sets (which occur in many fundamental models in economics and multiagent systems) while retaining efficient strategy updates is an important question. In this paper, we answer this in the positive by establishing the first uncoupled learning algorithm with O(log T) per-player regret in general convex games, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets. Our learning dynamics are based on an instantiation of optimistic follow-the-regularized-leader over an appropriately lifted space using a self-concordant regularizer that is peculiarly not a barrier for the feasible region. Our learning dynamics are efficiently implementable given access to a proximal oracle for the convex strategy set, leading to O(log log T) per-iteration complexity; we also give extensions when access to only a linear optimization oracle is assumed. Finally, we adapt our dynamics to guarantee O(√T) regret in the adversarial regime. Even in those special cases where prior results apply, our algorithm improves over the state-of-the-art regret bounds either in terms of the dependence on the number of iterations or on the dimension of the strategy sets.
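As a rough illustration of the optimistic follow-the-regularized-leader (OFTRL) template the abstract refers to, the sketch below runs OFTRL in self-play on a small two-player zero-sum matrix game. It is not the paper's algorithm: the paper uses a self-concordant regularizer over a lifted space on general convex sets, whereas this sketch swaps in the negative-entropy regularizer on the probability simplex (often called optimistic Hedge), for which the update has a closed form. The function names, the payoff matrix, the step size `eta`, and the horizon `T` are all illustrative assumptions.

```python
import numpy as np

def oftrl_entropy(cum_utility, prediction, eta):
    """One OFTRL step on the simplex with a negative-entropy regularizer.

    Solves argmax_x <x, U + m> - (1/eta) * sum_i x_i log x_i, whose
    closed form is softmax(eta * (U + m)). This regularizer is an
    assumption made for simplicity, not the one used in the paper.
    """
    z = eta * (cum_utility + prediction)
    z -= z.max()                      # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def self_play(A, T=2000, eta=0.1):
    """Both players run OFTRL; the row player maximizes x^T A y."""
    n, m = A.shape
    Ux, Uy = np.zeros(n), np.zeros(m)   # cumulative utility vectors
    mx, my = np.zeros(n), np.zeros(m)   # predictions: last observed utility
    realized_x = realized_y = 0.0
    for _ in range(T):
        x = oftrl_entropy(Ux, mx, eta)
        y = oftrl_entropy(Uy, my, eta)
        ux = A @ y                      # row player's utility vector
        uy = -A.T @ x                   # column player's utility vector
        realized_x += x @ ux
        realized_y += y @ uy
        Ux += ux
        Uy += uy
        mx, my = ux, uy                 # optimism: predict a repeat of the last utility
    # Regret against the best fixed pure strategy in hindsight.
    regret_x = Ux.max() - realized_x
    regret_y = Uy.max() - realized_y
    return regret_x, regret_y, x, y

if __name__ == "__main__":
    # Hypothetical 3x3 payoff matrix, chosen only for illustration.
    A = np.array([[0.5, -1.0, 0.2],
                  [-0.3, 0.4, -0.6],
                  [0.1, -0.2, 0.9]])
    rx, ry, x, y = self_play(A)
    print("row regret:", rx, "column regret:", ry)
```

The only difference from plain follow-the-regularized-leader is the extra `prediction` term, which is what lets regret stay small when every player uses the same dynamics and the utilities therefore change slowly between rounds.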
Pages: 14