Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

Cited by: 0
Authors
Farina, Gabriele [1]
Kroer, Christian [2]
Sandholm, Tuomas [1, 3, 4, 5]
Affiliations
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
[2] Columbia Univ, IEOR Dept, New York, NY 10027 USA
[3] Strategic Machine Inc, Morristown, NJ USA
[4] Strategy Robot Inc, Pittsburgh, PA USA
[5] Optimized Markets Inc, Pittsburgh, PA USA
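Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract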
We study the performance of optimistic regret-minimization algorithms for both minimizing regret in, and computing Nash equilibria of, zero-sum extensive-form games. To apply these algorithms to extensive-form games, a distance-generating function is needed. We study the use of the dilated entropy and dilated Euclidean distance functions. For the dilated Euclidean distance function, we prove the first explicit bounds on the strong-convexity parameter for general treeplexes. Furthermore, we show that the use of dilated distance-generating functions enables us to decompose the mirror descent algorithm, and its optimistic variant, into local mirror descent algorithms at each information set. This decomposition mirrors the structure of the counterfactual regret minimization framework, and enables important techniques in practice, such as distributed updates and pruning of cold parts of the game tree. Our algorithms provably converge at a rate of T^{-1}, which is superior to prior counterfactual regret minimization algorithms. We experimentally compare to the popular algorithm CFR+, which has a theoretical convergence rate of T^{-0.5}, but is known to often converge at a rate of T^{-1}, or better, in practice. We give an example matrix game where CFR+ experimentally converges at a relatively slow rate of T^{-0.74}, whereas our optimistic methods converge faster than T^{-1}. We go on to show that our fast rate also holds in the Kuhn poker game, which is an extensive-form game. For games with deeper game trees, however, we find that CFR+ is still faster. Finally, we show that when the goal is minimizing regret, rather than computing a Nash equilibrium, our optimistic methods can outperform CFR+, even in deep game trees.
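As a rough illustration of the optimistic idea in the simplest setting the abstract mentions (a matrix game, which corresponds to a single decision point rather than a full treeplex), the following sketch runs optimistic multiplicative weights, i.e. optimistic mirror descent with the entropy distance-generating function on the simplex, for both players and reports the saddle-point gap of the average strategies. It is not the paper's implementation: the payoff matrix `EXAMPLE_GAME`, the step size `eta=0.1`, and all function names are assumptions chosen for illustration, and the dilated construction for extensive-form games is not covered.

```python
import math

# Hypothetical 3x3 zero-sum game: the row player minimizes x^T A y,
# the column player maximizes it. Chosen only for illustration.
EXAMPLE_GAME = [
    [2.0, -1.0, 0.0],
    [-1.0, 1.0, 1.0],
    [0.0, 1.0, -3.0],
]


def softmax_neg(scores, eta):
    """Return the distribution proportional to exp(-eta * scores)."""
    low = min(scores)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (s - low)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def optimistic_hedge(A, T, eta=0.1):
    """Run T rounds of optimistic multiplicative weights for both players.

    Each player uses the previous loss vector as its prediction of the
    next one. Returns the average strategies of both players.
    """
    n, m = len(A), len(A[0])
    cum_x, cum_y = [0.0] * n, [0.0] * m    # cumulative losses
    last_x, last_y = [0.0] * n, [0.0] * m  # predictions (last losses)
    sum_x, sum_y = [0.0] * n, [0.0] * m    # running sums of strategies
    for _ in range(T):
        x = softmax_neg([c + p for c, p in zip(cum_x, last_x)], eta)
        y = softmax_neg([c + p for c, p in zip(cum_y, last_y)], eta)
        # Loss vectors: A y for the minimizer, -A^T x for the maximizer.
        loss_x = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        loss_y = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        cum_x = [c + l for c, l in zip(cum_x, loss_x)]
        cum_y = [c + l for c, l in zip(cum_y, loss_y)]
        last_x, last_y = loss_x, loss_y
        sum_x = [s + v for s, v in zip(sum_x, x)]
        sum_y = [s + v for s, v in zip(sum_y, y)]
    return [s / T for s in sum_x], [s / T for s in sum_y]


def saddle_point_gap(A, x, y):
    """Best-response value against x minus best-response value against y."""
    n, m = len(A), len(A[0])
    best_for_max = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_for_min = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_for_max - best_for_min


for rounds in (100, 1000):
    avg_x, avg_y = optimistic_hedge(EXAMPLE_GAME, rounds)
    print(rounds, saddle_point_gap(EXAMPLE_GAME, avg_x, avg_y))
```

The saddle-point gap is zero exactly at a Nash equilibrium, so watching how it shrinks as the number of rounds grows gives an empirical read on the convergence rate for this one small game.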
Pages: 11