Value functions for depth-limited solving in zero-sum imperfect-information games

Cited by: 3

Authors:

Kovarik, Vojtech [1]

Seitz, Dominik [1]

Lisy, Viliam [1]

Rudolf, Jan [1]

Sun, Shuo [1]

Ha, Karel [1]

Affiliations:

[1] Czech Tech Univ, Artificial Intelligence Ctr, FEE, Prague, Czech Republic

Source:

ARTIFICIAL INTELLIGENCE | 2023 / Vol. 314

Keywords:

Imperfect information game; Multiagent reinforcement learning; Extensive form game; Partially observable stochastic game; Depth limited game; Depth limited solving; Value function; Counterfactual regret minimization;

DOI:

10.1016/j.artint.2022.103805

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary (perfect recall) extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games is value functions. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game. (c) 2022 Published by Elsevier B.V.


Pages: 51

50 items in total

[31] Linear Programming Modeling for Solving Fuzzy Zero-Sum Games
Briao, Stephanie Loi
Dimuro, Gracaliz Pereira
Santos Machado, Catia Maria
2013 2ND WORKSHOP-SCHOOL ON THEORETICAL COMPUTER SCIENCE (WEIT), 2013, : 84 - 91
[32] Automatically designing counterfactual regret minimization algorithms for solving imperfect-information games
Li, Kai
Xu, Hang
Fu, Haobo
Fu, Qiang
Xing, Junliang
ARTIFICIAL INTELLIGENCE, 2024, 337
[33] A PROBABILISTIC REPRESENTATION FOR THE VALUE OF ZERO-SUM DIFFERENTIAL GAMES WITH INCOMPLETE INFORMATION ON BOTH SIDES
Gensbittel, Fabien
Rainer, Catherine
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2017, 55 (02) : 693 - 723
[34] Learning Strategies for Imperfect Information Board Games Using Depth-Limited Counterfactual Regret Minimization and Belief State
Chen, Chen
Kaneko, Tomoyuki
2022 IEEE CONFERENCE ON GAMES, COG, 2022, : 486 - 493
[35] Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information
Wiggers, Auke J.
Oliehoek, Frans A.
Roijers, Diederik M.
ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 1628 - 1629
[36] An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information
Bosansky, Branislav
Kiekintveld, Christopher
Lisy, Viliam
Pechoucek, Michal
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51 : 829 - 866
[37] The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information
Dahl, Fredrik A.
MACHINE LEARNING, 2002, 49 : 5 - 37
[38] The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
Dahl, FA
MACHINE LEARNING, 2002, 49 (01) : 5 - 37
[39] Evaluating Information in Zero-Sum Games with Incomplete Information on Both Sides
De Meyer, Bernard
Lehrer, Ehud
Rosenberg, Dinah
MATHEMATICS OF OPERATIONS RESEARCH, 2010, 35 (04) : 851 - 863
[40] Heuristic Search Value Iteration for Zero-Sum Stochastic Games
Buffet, Olivier
Dibangoye, Jilles
Saffidine, Abdallah
Thomas, Vincent
IEEE TRANSACTIONS ON GAMES, 2021, 13 (03) : 239 - 248
