Monotonic Model Improvement Self-play Algorithm for Adversarial Games

Cited: 0
Authors
Sundar, Poorna Syama [1 ]
Vasam, Manjunath [1 ]
Joseph, Ajin George [1 ]
Affiliations
[1] Indian Inst Technol Tirupati, Dept Comp Sci & Engn, Tirupati, Andhra Pradesh, India
DOI
10.1109/CDC49753.2023.10383417
Chinese Library Classification: TP [Automation Technology, Computer Technology]
Discipline code: 0812
Abstract
The problem of solving strategy games has intrigued the scientific community for centuries. In this paper, we consider two-player adversarial zero-sum symmetric games with no information loss, i.e., games of perfect information. Both players continuously attempt to steer the current game state to their own advantage, so the gains of one player always equal the losses of the other. We propose a model improvement self-play algorithm in which the agent iteratively switches roles to subdue the current adversary strategy. This monotonic improvement sequence ultimately yields a single, competent, absolute no-loss policy for the game environment. This tactic is the first of its kind in the setting of two-player adversarial games. Our approach performs competitively, and at times at an expert level, in games such as 4x4 tic-tac-toe, 5x5 domineering, cram, and dots & boxes, using a minimum number of moves.
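The abstract describes an iterated role-switching scheme: freeze the current adversary strategy, compute a policy that subdues it, then swap roles and repeat until improvement stops. As a minimal sketch of that loop (not the authors' actual algorithm, which targets games like 4x4 tic-tac-toe and domineering), the following toy example runs iterated best response on single-pile Nim; all function names here are illustrative assumptions.

```python
# Illustrative sketch of a role-switching self-play loop on a toy game:
# single-pile Nim where each player removes 1 or 2 stones and whoever
# takes the last stone wins. Each iteration computes a best response to
# the frozen adversary policy; the new policy then becomes the adversary.

def moves(pile):
    """Legal numbers of stones to remove from the pile."""
    return [t for t in (1, 2) if t <= pile]

def value_vs(pile, opponent, agent_to_move):
    """Game value for the agent (+1 win, -1 loss) when the adversary
    plays the fixed policy `opponent` (a dict: pile size -> take)."""
    if pile == 0:
        # The previous mover took the last stone and won.
        return -1 if agent_to_move else 1
    if agent_to_move:
        return max(value_vs(pile - t, opponent, False) for t in moves(pile))
    return value_vs(pile - opponent[pile], opponent, True)

def best_response(n, opponent):
    """A policy that subdues the frozen adversary from every pile size."""
    return {pile: max(moves(pile),
                      key=lambda t: value_vs(pile - t, opponent, False))
            for pile in range(1, n + 1)}

def self_play(n, iters):
    """Iterated role switching from a naive 'always take 1' start."""
    policy = {pile: 1 for pile in range(1, n + 1)}
    for _ in range(iters):
        policy = best_response(n, policy)  # subdue, then swap roles
    return policy

def play(pile, first, second):
    """Simulate one game; returns 'first' or 'second' for the winner."""
    mover = "first"
    while True:
        pile -= (first if mover == "first" else second)[pile]
        if pile == 0:
            return mover
        mover = "second" if mover == "first" else "first"
```

On this game the loop reaches a fixed point after two role switches: at every winning pile size the learned policy takes `pile % 3` stones, the known optimal strategy, so the first mover wins exactly when the pile is not a multiple of 3. The paper's setting replaces this exhaustive best-response step with learned value estimates, but the monotonic subdue-and-swap structure is the same.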
Pages: 5600-5605
Page count: 6
Related papers (50 records total)
  • [32] A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games
    Zhang, Li
    Chen, Yuxuan
    Wang, Wei
    Han, Ziliang
    Li, Shijian
    Pan, Zhijie
    Pan, Gang
    FRONTIERS OF COMPUTER SCIENCE, 2021, 15 (05)
  • [33] Reinforcement learning for extended reality: designing self-play scenarios
    Leal, Leonardo A. Espinosa
    Chapman, Anthony
    Westerlund, Magnus
    PROCEEDINGS OF THE 52ND ANNUAL HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, 2019, : 156 - 163
  • [34] Latent Space Alignment Using Adversarially Guided Self-Play
    Tucker, Mycal
    Zhou, Yilun
    Shah, Julie
    INTERNATIONAL JOURNAL OF HUMAN-COMPUTER INTERACTION, 2022, 38 (18-20) : 1753 - 1771
  • [35] Mastering construction heuristics with self-play deep reinforcement learning
    Qi Wang
    Yuqing He
    Chunlei Tang
    Neural Computing and Applications, 2023, 35 : 4723 - 4738
  • [36] Alternative Loss Functions in AlphaZero-like Self-play
    Wang, Hui
    Emmerich, Michael
    Preuss, Mike
    Plaat, Aske
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 155 - 162
  • [37] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
    Zha, Daochen
    Xie, Jingru
    Ma, Wenye
    Zhang, Sheng
    Lian, Xiangru
    Hu, Xia
    Liu, Ji
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [38] Self-play, deep search and diminishing returns - Ken Thompson
    Heinz, EA
    ICGA JOURNAL, 2001, 24 (02) : 75 - 79
  • [39] The Applicability of Self-Play Algorithms to Trading and Forecasting Financial Markets
    Posth, Jan-Alexander
    Kotlarz, Piotr
    Misheva, Branka Hadji
    Osterrieder, Joerg
    Schwendner, Peter
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2021, 4
  • [40] A Self-Play Policy Optimization Approach to Battling Pokémon
    Huang, Dan
    Lee, Scott
    2019 IEEE CONFERENCE ON GAMES (COG), 2019