Value functions for depth-limited solving in zero-sum imperfect-information games

Cited by: 3
Authors
Kovarik, Vojtech [1]
Seitz, Dominik [1]
Lisy, Viliam [1]
Rudolf, Jan [1]
Sun, Shuo [1]
Ha, Karel [1]
Affiliations
[1] Czech Tech Univ, Artificial Intelligence Ctr, FEE, Prague, Czech Republic
Keywords
Imperfect information game; Multiagent reinforcement learning; Extensive form game; Partially observable stochastic game; Depth limited game; Depth limited solving; Value function; Counterfactual regret minimization;
DOI
10.1016/j.artint.2022.103805
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary (perfect recall) extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games is value functions. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game. © 2022 Published by Elsevier B.V.
Pages: 51
Related papers
50 in total
  • [1] Depth-Limited Solving for Imperfect-Information Games
    Brown, Noam
    Sandholm, Tuomas
    Amos, Brandon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [2] Asymmetric co-evolution for imperfect-information zero-sum games
    Halck, OM
    Dahl, FA
    MACHINE LEARNING: ECML 2000, 2000, 1810 : 171 - 182
  • [3] Solving imperfect-information games
    Sandholm, Tuomas
    SCIENCE, 2015, 347 (6218) : 122 - 123
  • [4] Limited Lookahead in Imperfect-Information Games
    Kroer, Christian
    Sandholm, Tuomas
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 575 - 581
  • [5] Limited lookahead in imperfect-information games
    Kroer, Christian
    Sandholm, Tuomas
    ARTIFICIAL INTELLIGENCE, 2020, 283
  • [6] Endgame Solving in Large Imperfect-Information Games
    Ganzfried, Sam
    Sandholm, Tuomas
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 37 - 45
  • [7] Uniform continuity of the value of zero-sum games with differential information
    Einy, Ezra
    Haimanko, Ori
    Moreno, Diego
    Shitovitz, Benyamin
    MATHEMATICS OF OPERATIONS RESEARCH, 2008, 33 (03) : 552 - 560
  • [8] Safe and Nested Subgame Solving for Imperfect-Information Games
    Brown, Noam
    Sandholm, Tuomas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [9] Iterative Algorithm for Solving Two-player Zero-sum Extensive-form Games with Imperfect Information
    Bosansky, Branislav
    Kiekintveld, Christopher
    Lisy, Viliam
    Pechoucek, Michal
    20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012), 2012, 242 : 193 - +
  • [10] CONTINUITY PROPERTIES OF VALUE FUNCTIONS IN INFORMATION STRUCTURES FOR ZERO-SUM AND GENERAL GAMES AND STOCHASTIC TEAMS*
    Hogeboom-Burr, Ian
    Yuksel, Serdar
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2023, 61 (02) : 398 - 414