The Feasibility of Deep Counterfactual Regret Minimisation for Trading Card Games

Cited by: 0

Authors:

Adams, David [1]

Affiliation:

[1] Univ Western Australia, Perth, WA 6009, Australia

Source:

AI 2022: ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022 / Vol. 13728

Keywords:

Artificial intelligence; Machine learning; Extensive-form games; Chess; Poker; Go

DOI:

10.1007/978-3-031-22695-3_11

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Counterfactual Regret Minimisation (CFR) is the leading technique for approximating Nash equilibria in imperfect-information games. It was an integral part of Libratus, the first AI to beat professionals at heads-up no-limit Texas Hold'em poker. However, current implementations of CFR rely on a tabular game representation and hand-crafted abstractions to reduce the state space, which limits their ability to scale to larger and more complex games. More recently, techniques such as Deep CFR (DCFR), Variance-Reduction Monte Carlo CFR (VR-MCCFR) and Double Neural CFR (DN-CFR) have been proposed to alleviate CFR's shortcomings, both by learning the game state and by reducing the overall computation through aggressive sampling. Properly testing potential performance improvements requires a class of game harder than poker, especially since current agents already play at superhuman levels. The trading card game Yu-Gi-Oh was selected because its game interactions are highly sophisticated, its overall state space is many orders of magnitude larger than poker's, and simulator implementations already exist. It also introduces the concept of a meta-strategy, in which a player strategically chooses a specific set of cards to play from a large pool. Overall, this work seeks to evaluate whether newer CFR methods scale to harder games by comparing the performance of existing techniques, such as regular CFR and heuristic agents, with that of the newer DCFR, and by examining whether these agents can provide automated evaluation of meta-strategies.
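For readers unfamiliar with the tabular setting the abstract refers to, the following is a minimal sketch of regret matching, the update rule at the core of CFR, applied to rock-paper-scissors. It is not the paper's implementation and does not involve Deep CFR or Yu-Gi-Oh. The game, the function names (`regret_matching`, `action_values`, `train`, `exploitability`) and the non-zero starting regrets used to break symmetry are all assumptions made for illustration. Because rock-paper-scissors has a single decision point, the counterfactual values reduce to plain expected payoffs.

```python
# Illustrative sketch only: tabular regret matching in self-play on
# rock-paper-scissors. All names and parameters here are hypothetical.

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b
# (0 = rock, 1 = paper, 2 = scissors). The game is symmetric and zero-sum.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: play uniformly


def action_values(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
            for a in range(3)]


def train(iterations, initial_regrets=((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))):
    """Run simultaneous self-play updates and return both average strategies."""
    regrets = [list(r) for r in initial_regrets]  # assumed asymmetric start
    strategy_sums = [[0.0] * 3 for _ in range(2)]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for p in range(2):
            values = action_values(strategies[1 - p])
            expected = sum(s * v for s, v in zip(strategies[p], values))
            for a in range(3):
                regrets[p][a] += values[a] - expected  # regret for not playing a
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def exploitability(strategy):
    """Best-response payoff against the strategy (the game value is 0)."""
    return max(action_values(strategy))


if __name__ == "__main__":
    average = train(10_000)
    print(average[0], exploitability(average[0]))
```

The table of cumulative regrets here has one row per player; in a game with many information sets the table needs one row per information set, which is the scaling limitation the abstract describes and which neural variants aim to avoid by approximating the table with a network.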

Pages: 145-160

Page count: 16

23 items in total

[1] Deep Counterfactual Regret Minimization
Brown, Noam
Lerer, Adam
Gross, Sam
Sandholm, Tuomas
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
[2] Counterfactual Regret Minimization in Sequential Security Games
Lisy, Viliam
Davis, Trevor
Bowling, Michael
[J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 544 - 550
[3] Solving Poker Games Efficiently: Adaptive Memory based Deep Counterfactual Regret Minimization
Shi, Shuqing
Wang, Xiaobin
Hao, Dong
Yang, Zhiyou
Qu, Hong
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[4] RLCFR: Minimize counterfactual regret by deep reinforcement learning
Li, Huale
Wang, Xuan
Jia, Fengwei
Wu, Yulin
Zhang, Jiajia
Qi, Shuhan
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 187
[5] Deep Counterfactual Regret Minimization Algorithm with Regret Discount in Radar Anti-Jamming Game
Xu, Yifei
Zhang, Jiahua
Tian, Feng
[J]. International Conference on Communication Technology Proceedings, ICCT, 2023, : 1754 - 1758
[6] Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games
Lisy, Viliam
Lanctot, Marc
Bowling, Michael
[J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 27 - 36
[7] Improving Counterfactual Regret Minimization Agents Training in Card Game Cheat Using Ordered Abstraction
Yi, Cheng
Kaneko, Tomoyuki
[J]. ADVANCES IN COMPUTER GAMES, ACG 2021, 2022, 13262 : 3 - 13
[8] Automatically designing counterfactual regret minimization algorithms for solving imperfect-information games
Li, Kai
Xu, Hang
Fu, Haobo
Fu, Qiang
Xing, Junliang
[J]. Artificial Intelligence, 2024, 337
[9] The lure of the sorcerer: Consumers' consumption meanings in the context of trading card games
Martin, Brett A. S.
[J]. JOURNAL OF STRATEGIC MARKETING, 2019, 27 (02) : 151 - 163
[10] Cheat-Proof Peer-to-Peer Trading Card Games
Pittman, Daniel
GauthierDickey, Chris
[J]. 2011 10TH ANNUAL WORKSHOP ON NETWORK AND SYSTEMS SUPPORT FOR GAMES (NETGAMES 2011), 2011,