Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability

Cited by: 0
Authors
MacQueen, Revan [1,2]
Wright, James R. [1,2]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] Amii, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
REINFORCEMENT; LEVEL; GO;
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Self-play is a technique for machine learning in multi-agent systems in which a learning algorithm learns by interacting with copies of itself. Self-play is useful for generating large quantities of training data, but it has the drawback that the agents the learner faces post-training may behave very differently from what the learner came to expect by interacting with itself. For the special case of two-player constant-sum games, self-play that reaches a Nash equilibrium is guaranteed to produce strategies that perform well against any post-training opponent; however, no such guarantee exists for multiplayer games. We show that in games that approximately decompose into a set of two-player constant-sum games (called constant-sum polymatrix games) and in which global epsilon-Nash equilibria are boundedly far from Nash equilibria in each subgame (called subgame stability), any no-external-regret algorithm that learns by self-play produces a strategy with bounded vulnerability. For the first time, our results identify a structural property of multiplayer games that enables performance guarantees for the strategies produced by a broad class of self-play algorithms. We demonstrate our findings through experiments on Leduc poker.
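The sketch below is not taken from the paper; it is a minimal, illustrative construction (names such as pairwise_payoff and the choice of matching pennies are assumptions) of the setting the abstract describes: a three-player zero-sum polymatrix game, in which each player's payoff is a sum of two-player zero-sum subgames, trained in self-play with regret matching, a standard no-external-regret algorithm. In constant-sum polymatrix games the time-averaged strategies of such learners approach a Nash equilibrium, which is the regime in which the paper's vulnerability bound applies.

import numpy as np

rng = np.random.default_rng(0)
n_players, n_actions, T = 3, 2, 20000

# Pairwise zero-sum subgame: matching pennies, payoffs from the row player's view.
MP = np.array([[1.0, -1.0],
               [-1.0, 1.0]])

def pairwise_payoff(i, j, a_i, a_j):
    # Player min(i, j) acts as the row player of the (i, j) subgame.
    return MP[a_i, a_j] if i < j else -MP[a_j, a_i]

def payoff(i, actions):
    # Polymatrix structure: player i's payoff is the sum of its pairwise subgames.
    return sum(pairwise_payoff(i, j, actions[i], actions[j])
               for j in range(n_players) if j != i)

regrets = np.zeros((n_players, n_actions))
strategy_sums = np.zeros((n_players, n_actions))

for t in range(T):
    # Regret matching: play each action in proportion to its positive cumulative regret.
    strategies = np.maximum(regrets, 0.0)
    for i in range(n_players):
        z = strategies[i].sum()
        strategies[i] = strategies[i] / z if z > 0 else np.full(n_actions, 1.0 / n_actions)
    strategy_sums += strategies

    actions = [int(rng.choice(n_actions, p=strategies[i])) for i in range(n_players)]
    for i in range(n_players):
        realized = payoff(i, actions)
        for a in range(n_actions):
            deviation = actions[:i] + [a] + actions[i + 1:]
            regrets[i, a] += payoff(i, deviation) - realized

avg_strategies = strategy_sums / T
print(avg_strategies)  # each row tends toward the uniform (0.5, 0.5) subgame equilibrium

The uniform limit here is specific to matching pennies; the paper's contribution is the general statement that, under approximate polymatrix decomposability and subgame stability, the strategy produced by any no-external-regret self-play learner has bounded vulnerability against arbitrary post-training opponents.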
Pages: 30