Efficient Learning in Polyhedral Games via Best-Response Oracles

Citations: 0

Authors:

Chakrabarti, Darshan [1]

Farina, Gabriele [2]

Kroer, Christian [1]

Affiliations:

[1] Columbia Univ, New York, NY 10027 USA

[2] MIT, Cambridge, MA 02139 USA

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9 | 2024

Funding:

U.S. National Science Foundation;

Keywords:

COMPUTATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study online learning and equilibrium computation in games with polyhedral decision sets, a property shared by normal-form games (NFGs) and extensive-form games (EFGs), when the learning agent is restricted to utilizing a best-response oracle. We show how to achieve constant regret in zero-sum games and $O(T^{1/4})$ regret in general-sum games while using only $O(\log t)$ best-response queries at a given iteration $t$, thus improving over the best prior result, which required $O(T)$ queries per iteration. Moreover, our framework yields the first last-iterate convergence guarantees for self-play with best-response oracles in zero-sum games. This convergence occurs at a linear rate, though with a condition-number dependence. We go on to show an $O(1/\sqrt{T})$ best-iterate convergence rate without such a dependence. Our results build on linear-rate convergence results for variants of the Frank-Wolfe (FW) algorithm for strongly convex and smooth minimization problems over polyhedral domains. These FW results depend on a condition number of the polytope, known as facial distance. In order to enable application to settings such as EFGs, we show two broad new results: 1) the facial distance for polytopes of the form $\{x \in \mathbb{R}^n_{\ge 0} \mid Ax = b\}$ is at least $\gamma/\sqrt{k}$, where $\gamma$ is the minimum value of a nonzero coordinate of a vertex in the polytope and $k \le n$ is the number of tight inequality constraints in the optimal face, and 2) the facial distance for polytopes of the form $Ax = b$, $Cx \le d$, $x \ge 0$, where $x \in \mathbb{R}^n$, $C \ge 0$ is a nonzero integral matrix, and $d \ge 0$, is at least $1/(\|C\|_\infty \sqrt{n})$. This yields the first such results for several problems, such as sequence-form polytopes, flow polytopes, and matching polytopes.


Pages: 9564-9572

Number of pages: 9
