Monotonic Model Improvement Self-play Algorithm for Adversarial Games

Cited by: 0
Authors:
Sundar, Poorna Syama [1 ]
Vasam, Manjunath [1 ]
Joseph, Ajin George [1 ]
Affiliations:
[1] Indian Inst Technol Tirupati, Dept Comp Sci & Engn, Tirupati, Andhra Pradesh, India
DOI: 10.1109/CDC49753.2023.10383417
CLC number: TP [automation technology; computer technology]
Discipline code: 0812
Abstract:
The problem of solving strategy games has intrigued the scientific community for centuries. In this paper, we consider two-player adversarial zero-sum symmetric games with zero information loss (i.e., perfect information). Both players continuously attempt to make decisions that change the current game state to their advantage, so the gains of one player always equal the losses of the other. We propose a model improvement self-play algorithm in which the agent iteratively switches roles to subdue the current adversary strategy. This monotonic improvement sequence ultimately yields a single, competent, absolute no-loss policy for the game environment. This tactic is the first of its kind in the setting of two-player adversarial games. Our approach performs competitively, and at times expertly, in games such as 4x4 tic-tac-toe, 5x5 domineering, cram, and dots & boxes, using a minimal number of moves.
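The role-switching self-play loop described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it uses ordinary 3x3 tic-tac-toe for brevity, an exact best-response search as the model-improvement step, and all function names (`best_response`, `naive`, `play`, `swap`) are assumptions made for this sketch.

```python
# Illustrative sketch (not the paper's code): iterative best-response
# self-play with role switching, on 3x3 tic-tac-toe for brevity.
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != '.' and b[i] == b[j] == b[k]:
            return b[i]
    return None

def moves(b):
    return [i for i, c in enumerate(b) if c == '.']

def place(b, m, mark):
    return b[:m] + mark + b[m + 1:]

def swap(b):
    # Relabel marks so every policy can act as if it were 'X' to move;
    # this exploits the symmetry of the game.
    return b.translate(str.maketrans('XO', 'OX'))

def best_response(opponent):
    """Exact best response for the player to move ('X') against a frozen
    deterministic opponent policy -- the model-improvement step."""
    @lru_cache(maxsize=None)
    def search(b):                        # -> (best move, value) for 'X'
        best = (None, -2)
        for m in moves(b):
            nb = place(b, m, 'X')
            if winner(nb) == 'X':
                v = 1
            elif not moves(nb):
                v = 0
            else:
                ob = place(nb, opponent(swap(nb)), 'O')
                if winner(ob) == 'O':
                    v = -1
                elif not moves(ob):
                    v = 0
                else:
                    v = search(ob)[1]
            if v > best[1]:
                best = (m, v)
        return best
    return lambda b: search(b)[0]

def naive(b):
    return moves(b)[0]                    # weak initial adversary

def play(first, second):
    """Play one game; each policy sees a board where it is 'X' to move."""
    b, turn = '.' * 9, 0
    sides = [(first, 'X'), (second, 'O')]
    while winner(b) is None and moves(b):
        pol, mark = sides[turn % 2]
        b = place(b, pol(b if mark == 'X' else swap(b)), mark)
        turn += 1
    return winner(b)                      # 'X', 'O', or None for a draw

# Monotonic improvement loop: the improved policy becomes the adversary.
adversary = naive
for _ in range(3):
    learner = best_response(adversary)
    adversary = learner                   # role switch
```

Each iteration computes a best response to the frozen adversary and then promotes it to be the next adversary; by construction the new policy does at least as well against the old one as any alternative, which is the monotonic-improvement property the abstract refers to.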
Pages: 5600 - 5605
Page count: 6
Related Papers (50 total)
  • [11] A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
    Liu, Qinghua
    Yu, Tiancheng
    Bai, Yu
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [12] Efficient Parallel Design for Self-Play in Two-Player Zero-Sum Games
    Tang, Hongsong
    Chen, Bo
    Liu, Yingzhuo
    Han, Kuoye
    Liu, Jingqian
    Qu, Zhaowei
    SYMMETRY-BASEL, 2025, 17 (02):
  • [13] Competing for Pixels: A Self-Play Algorithm for Weakly-Supervised Semantic Segmentation
    Saeed, Shaheer U.
    Huang, Shiqi
    Ramalhinho, Joao
    Gayo, Iani J. M. B.
    Montana-Brown, Nina
    Bonmati, Ester
    Pereira, Stephen P.
    Davidson, Brian
    Barratt, Dean C.
    Clarkson, Matthew J.
    Hu, Yipeng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (02) : 825 - 839
  • [14] Self-play Reinforcement Learning for Video Transmission
    Huang, Tianchi
    Zhang, Rui-Xiao
    Sun, Lifeng
    NOSSDAV '20: PROCEEDINGS OF THE 2020 WORKSHOP ON NETWORK AND OPERATING SYSTEM SUPPORT FOR DIGITAL AUDIO AND VIDEO, 2020, : 7 - 13
  • [15] Learning to Drive via Asymmetric Self-Play
    Zhang, Chris
    Biswas, Sourav
    Wong, Kelvin
    Fallah, Kion
    Zhang, Lunjun
    Chen, Dian
    Casas, Sergio
    Urtasun, Raquel
    COMPUTER VISION - ECCV 2024, PT LXII, 2025, 15120 : 149 - 168
  • [16] A Proposal of Score Distribution Predictive Model in Self-Play Deep Reinforcement Learning
    Kagoshima, Kazuya
    Sakaji, Hiroki
    Noda, Itsuki
    Transactions of the Japanese Society for Artificial Intelligence, 2024, 39 (05)
  • [17] A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
    Silver, David
    Hubert, Thomas
    Schrittwieser, Julian
    Antonoglou, Ioannis
    Lai, Matthew
    Guez, Arthur
    Lanctot, Marc
    Sifre, Laurent
    Kumaran, Dharshan
    Graepel, Thore
    Lillicrap, Timothy
    Simonyan, Karen
    Hassabis, Demis
    SCIENCE, 2018, 362 (6419) : 1140 - +
  • [18] Self-play reinforcement learning guides protein engineering
    Wang, Yi
    Tang, Hui
    Huang, Lichao
    Pan, Lulu
    Yang, Lixiang
    Yang, Huanming
    Mu, Feng
    Yang, Meng
    Nature Machine Intelligence, 2023, 5 : 845 - 860
  • [19] Self-Play for Training General Fighting Game AI
    Takano, Yoshina
    Inoue, Hideyasu
    Thawonmas, Ruck
    Harada, Tomohiro
    2019 NICOGRAPH INTERNATIONAL (NICOINT), 2019, : 120 - 120
  • [20] A Comparison of Self-Play Algorithms Under a Generalized Framework
    Hernandez, Daniel
    Denamganai, Kevin
    Devlin, Sam
    Samothrakis, Spyridon
    Walker, James Alfred
    IEEE TRANSACTIONS ON GAMES, 2022, 14 (02) : 221 - 231