Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability

Cited by: 0
Authors
MacQueen, Revan [1, 2]
Wright, James R. [1, 2]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] Amii, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
REINFORCEMENT; LEVEL; GO;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Self-play is a technique for machine learning in multi-agent systems where a learning algorithm learns by interacting with copies of itself. Self-play is useful for generating large quantities of data for learning, but has the drawback that the agents the learner will face post-training may behave dramatically differently from what the learner came to expect by interacting with itself. For the special case of two-player constant-sum games, self-play that reaches Nash equilibrium is guaranteed to produce strategies that perform well against any post-training opponent; however, no such guarantee exists for multiplayer games. We show that in games that approximately decompose into a set of two-player constant-sum games (called constant-sum polymatrix games) where global epsilon-Nash equilibria are boundedly far from Nash equilibria in each subgame (called subgame stability), any no-external-regret algorithm that learns by self-play will produce a strategy with bounded vulnerability. For the first time, our results identify a structural property of multiplayer games that enables performance guarantees for the strategies produced by a broad class of self-play algorithms. We demonstrate our findings through experiments on Leduc poker.
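The setting described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from the paper: a three-player game in which every pair of players plays rock-paper-scissors (so each two-player subgame is zero-sum and the game is exactly polymatrix), with regret matching standing in as one example of a no-external-regret algorithm. The names `RPS`, `regret_matching`, and `self_play` are hypothetical, and this is not the authors' Leduc poker experiment.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors; the column player receives the negation.
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])


def regret_matching(regrets, fallback):
    """Play proportionally to positive regret; use `fallback` when none is positive."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else fallback


def self_play(num_players=3, iterations=5000):
    """Run regret matching for all players simultaneously.

    Returns each player's average strategy and each player's largest
    per-action external regret divided by the number of iterations.
    """
    n_actions = RPS.shape[0]
    regrets = np.zeros((num_players, n_actions))
    strategy_sums = np.zeros((num_players, n_actions))
    # Assumed asymmetric starting strategies (one pure action per player),
    # so the dynamics do not sit at the uniform profile from the first step.
    fallbacks = np.eye(n_actions)[np.arange(num_players) % n_actions]

    for _ in range(iterations):
        strategies = np.array([
            regret_matching(regrets[i], fallbacks[i]) for i in range(num_players)
        ])
        strategy_sums += strategies
        total = strategies.sum(axis=0)
        for i in range(num_players):
            # Polymatrix payoff: the sum of player i's payoffs across
            # its two-player subgames against every other player.
            action_values = RPS @ (total - strategies[i])
            regrets[i] += action_values - strategies[i] @ action_values

    return strategy_sums / iterations, regrets.max(axis=1) / iterations


if __name__ == "__main__":
    avg_strategies, avg_regret = self_play()
    print("average strategies:\n", avg_strategies)
    print("average external regret per player:", avg_regret)
```

The average regret shrinks as `iterations` grows, which is the no-external-regret property the abstract relies on. The sketch does not compute the paper's vulnerability measure or test subgame stability.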
Pages: 30