Value functions for depth-limited solving in zero-sum imperfect-information games

Cited by: 3
Authors
Kovarik, Vojtech [1]
Seitz, Dominik [1]
Lisy, Viliam [1]
Rudolf, Jan [1]
Sun, Shuo [1]
Ha, Karel [1]
Affiliations
[1] Czech Tech Univ, Artificial Intelligence Ctr, FEE, Prague, Czech Republic
Keywords
Imperfect information game; Multiagent reinforcement learning; Extensive form game; Partially observable stochastic game; Depth limited game; Depth limited solving; Value function; Counterfactual regret minimization;
DOI
10.1016/j.artint.2022.103805
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary (perfect recall) extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games is value functions. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game. (c) 2022 Published by Elsevier B.V.
Pages: 51
Related papers
50 in total
  • [31] Linear Programming Modeling for Solving Fuzzy Zero-Sum Games
    Briao, Stephanie Loi
    Dimuro, Gracaliz Pereira
    Santos Machado, Catia Maria
    2013 2ND WORKSHOP-SCHOOL ON THEORETICAL COMPUTER SCIENCE (WEIT), 2013, : 84 - 91
  • [32] Automatically designing counterfactual regret minimization algorithms for solving imperfect-information games
    Li, Kai
    Xu, Hang
    Fu, Haobo
    Fu, Qiang
    Xing, Junliang
    ARTIFICIAL INTELLIGENCE, 2024, 337
  • [33] A PROBABILISTIC REPRESENTATION FOR THE VALUE OF ZERO-SUM DIFFERENTIAL GAMES WITH INCOMPLETE INFORMATION ON BOTH SIDES
    Gensbittel, Fabien
    Rainer, Catherine
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2017, 55 (02) : 693 - 723
  • [34] Learning Strategies for Imperfect Information Board Games Using Depth-Limited Counterfactual Regret Minimization and Belief State
    Chen, Chen
    Kaneko, Tomoyuki
    2022 IEEE CONFERENCE ON GAMES, COG, 2022, : 486 - 493
  • [35] Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information
    Wiggers, Auke J.
    Oliehoek, Frans A.
    Roijers, Diederik M.
    ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 1628 - 1629
  • [36] An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information
    Bosansky, Branislav
    Kiekintveld, Christopher
    Lisy, Viliam
    Pechoucek, Michal
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51 : 829 - 866
  • [37] The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information
    Dahl, Fredrik A.
    MACHINE LEARNING, 2002, 49 : 5 - 37
  • [38] The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
    Dahl, FA
    MACHINE LEARNING, 2002, 49 (01) : 5 - 37
  • [39] Evaluating Information in Zero-Sum Games with Incomplete Information on Both Sides
    De Meyer, Bernard
    Lehrer, Ehud
    Rosenberg, Dinah
    MATHEMATICS OF OPERATIONS RESEARCH, 2010, 35 (04) : 851 - 863
  • [40] Heuristic Search Value Iteration for Zero-Sum Stochastic Games
    Buffet, Olivier
    Dibangoye, Jilles
    Saffidine, Abdallah
    Thomas, Vincent
    IEEE TRANSACTIONS ON GAMES, 2021, 13 (03) : 239 - 248