Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

Cited by: 0

Authors:

Farina, Gabriele [1]

Kroer, Christian [2]

Sandholm, Tuomas [1,3,4,5]

Affiliations:

[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

[2] Columbia Univ, IEOR Dept, New York, NY 10027 USA

[3] Strategic Machine Inc, Morristown, NJ USA

[4] Strategy Robot Inc, Pittsburgh, PA USA

[5] Optimized Markets Inc, Pittsburgh, PA USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32

Funding:

U.S. National Science Foundation

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We study the performance of optimistic regret-minimization algorithms for both minimizing regret in, and computing Nash equilibria of, zero-sum extensive-form games. In order to apply these algorithms to extensive-form games, a distance-generating function is needed. We study the use of the dilated entropy and dilated Euclidean distance functions. For the dilated Euclidean distance function we prove the first explicit bounds on the strong-convexity parameter for general treeplexes. Furthermore, we show that the use of dilated distance-generating functions enables us to decompose the mirror descent algorithm, and its optimistic variant, into local mirror descent algorithms at each information set. This decomposition mirrors the structure of the counterfactual regret minimization framework, and enables important techniques in practice, such as distributed updates and pruning of cold parts of the game tree. Our algorithms provably converge at a rate of T^{-1}, which is superior to prior counterfactual regret minimization algorithms. We experimentally compare to the popular algorithm CFR+, which has a theoretical convergence rate of T^{-0.5}, but is known to often converge at a rate of T^{-1}, or better, in practice. We give an example matrix game where CFR+ experimentally converges at a relatively slow rate of T^{-0.74}, whereas our optimistic methods converge faster than T^{-1}. We go on to show that our fast rate also holds in the Kuhn poker game, which is an extensive-form game. For games with deeper game trees, however, we find that CFR+ is still faster. Finally, we show that when the goal is minimizing regret, rather than computing a Nash equilibrium, our optimistic methods can outperform CFR+, even in deep game trees.
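
To make the idea of optimistic regret minimization concrete, the following is a minimal sketch for the simplest setting the abstract mentions, a zero-sum matrix game. It is not the authors' implementation: it assumes the plain (non-dilated) entropy distance-generating function on the simplex, uses the most recent loss as the prediction of the next one, and the function names, the 2x2 payoff matrix, the step size `eta`, and the iteration count `T` are all hypothetical choices made for illustration.

```python
import math


def softmax_neg(v, eta):
    """Return weights proportional to exp(-eta * v), computed stably."""
    lo = min(v)
    w = [math.exp(-eta * (vi - lo)) for vi in v]
    s = sum(w)
    return [wi / s for wi in w]


def optimistic_hedge(A, T=2000, eta=0.1):
    """Self-play with optimistic Hedge on min_x max_y x^T A y.

    Each player plays exp-weights on (cumulative loss + last loss),
    i.e. the last observed loss serves as the optimistic prediction.
    Returns the average strategies of both players.
    """
    n, m = len(A), len(A[0])
    cum_x, cum_y = [0.0] * n, [0.0] * m    # cumulative losses
    last_x, last_y = [0.0] * n, [0.0] * m  # most recent losses (predictions)
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(T):
        x = softmax_neg([c + p for c, p in zip(cum_x, last_x)], eta)
        y = softmax_neg([c + p for c, p in zip(cum_y, last_y)], eta)
        # Row player minimizes x^T A y; column player maximizes it.
        last_x = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        last_y = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        cum_x = [c + l for c, l in zip(cum_x, last_x)]
        cum_y = [c + l for c, l in zip(cum_y, last_y)]
        avg_x = [a + xi / T for a, xi in zip(avg_x, x)]
        avg_y = [a + yj / T for a, yj in zip(avg_y, y)]
    return avg_x, avg_y


def saddle_point_gap(A, x, y):
    """max_y' x^T A y' - min_x' x'^T A y; zero exactly at a Nash equilibrium."""
    n, m = len(A), len(A[0])
    best_col = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_row = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_col - best_row


# Hypothetical 2x2 payoff matrix (row player pays, column player receives).
A = [[2.0, -1.0], [-1.0, 1.0]]
x_avg, y_avg = optimistic_hedge(A)
gap = saddle_point_gap(A, x_avg, y_avg)
```

The saddle-point gap of the average strategies is the quantity whose decay rate in T the abstract refers to. Extending this sketch to extensive-form games would require replacing the simplex update with the dilated, per-information-set updates studied in the paper, which are not shown here.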


Pages: 11
