Efficient Learning in Polyhedral Games via Best-Response Oracles

Cited by: 0
Authors
Chakrabarti, Darshan [1]
Farina, Gabriele [2]
Kroer, Christian [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] MIT, Cambridge, MA 02139 USA
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9 | 2024
Funding
U.S. National Science Foundation;
Keywords
COMPUTATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study online learning and equilibrium computation in games with polyhedral decision sets, a property shared by normal-form games (NFGs) and extensive-form games (EFGs), when the learning agent is restricted to utilizing a best-response oracle. We show how to achieve constant regret in zero-sum games and $O(T^{1/4})$ regret in general-sum games while using only $O(\log t)$ best-response queries at a given iteration $t$, thus improving over the best prior result, which required $O(T)$ queries per iteration. Moreover, our framework yields the first last-iterate convergence guarantees for self-play with best-response oracles in zero-sum games. This convergence occurs at a linear rate, though with a condition-number dependence. We go on to show a $O(1/\sqrt{T})$ best-iterate convergence rate without such a dependence. Our results build on linear-rate convergence results for variants of the Frank-Wolfe (FW) algorithm for strongly convex and smooth minimization problems over polyhedral domains. These FW results depend on a condition number of the polytope, known as facial distance. In order to enable application to settings such as EFGs, we show two broad new results: 1) the facial distance for polytopes of the form $\{x \in \mathbb{R}^n_{\ge 0} \mid Ax = b\}$ is at least $\gamma/\sqrt{k}$, where $\gamma$ is the minimum value of a nonzero coordinate of a vertex in the polytope and $k \le n$ is the number of tight inequality constraints in the optimal face, and 2) the facial distance for polytopes of the form $Ax = b$, $Cx \le d$, $x \ge 0$, where $x \in \mathbb{R}^n$, $C \ge 0$ is a nonzero integral matrix, and $d \ge 0$, is at least $1/(\|C\|_\infty \sqrt{n})$. This yields the first such results for several problems, such as sequence-form polytopes, flow polytopes, and matching polytopes.
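As a rough illustration of how a best-response oracle drives a Frank-Wolfe iteration, the sketch below runs the vanilla FW method on the probability simplex. It is not the algorithm of the paper, which relies on linearly convergent FW variants; the objective (squared distance to a fixed point), the function names `best_response` and `frank_wolfe`, the step size 2/(t+2), and the `target` vector are all assumptions chosen only for illustration.

```python
def best_response(gradient):
    """Linear minimization oracle over the probability simplex.

    Returns the vertex (a pure strategy) minimizing <gradient, x>.
    """
    best = min(range(len(gradient)), key=lambda i: gradient[i])
    return [1.0 if i == best else 0.0 for i in range(len(gradient))]


def frank_wolfe(target, num_iters=2000):
    """Vanilla Frank-Wolfe for f(x) = 0.5 * ||x - target||^2 on the simplex.

    Each iteration makes exactly one oracle call and moves toward the
    returned vertex, so the iterate always stays inside the simplex.
    """
    n = len(target)
    x = [1.0 / n] * n  # start at the uniform strategy
    for t in range(num_iters):
        grad = [xi - ti for xi, ti in zip(x, target)]  # gradient of f at x
        s = best_response(grad)                        # one oracle query
        step = 2.0 / (t + 2.0)                         # standard open-loop step
        x = [(1.0 - step) * xi + step * si for xi, si in zip(x, s)]
    return x


# Hypothetical example: the target lies inside the simplex, so FW approaches it.
target = [0.2, 0.5, 0.3]
x = frank_wolfe(target)
print(x)
```

The iterate is only ever updated through convex combinations with oracle outputs, which is why access to best responses alone suffices to stay feasible in a polyhedral decision set.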
Pages: 9564 - 9572
Number of pages: 9