Monotonic Model Improvement Self-play Algorithm for Adversarial Games

Cited by: 0
Authors
Sundar, Poorna Syama [1]
Vasam, Manjunath [1]
Joseph, Ajin George [1]
Affiliations
[1] Indian Institute of Technology Tirupati, Department of Computer Science and Engineering, Tirupati, Andhra Pradesh, India
DOI
10.1109/CDC49753.2023.10383417
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The problem of solving strategy games has intrigued the scientific community for centuries. In this paper, we consider two-player, adversarial, zero-sum, symmetric games with zero information loss. In such games, each player continually tries to make decisions that move the current game state to their own advantage, so the gains of one player always equal the losses of the other. We propose a model improvement self-play algorithm in which the agent iteratively switches roles to subdue the current adversary strategy. This monotonic improvement sequence ultimately yields a monolithic, competent, absolute no-loss policy for the game environment. This approach is the first of its kind in the setting of two-player adversarial games. It can perform competitively, and sometimes expertly, in games such as 4x4 tic-tac-toe, 5x5 domineering, cram, and dots & boxes, using a minimum number of moves.
Pages: 5600-5605
Number of pages: 6
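Illustrative sketch
The abstract describes self-play in which the agent repeatedly takes the role opposite the current adversary strategy and improves on it. The Python sketch below is not the paper's algorithm. It swaps in plain iterated best response on a toy single-pile subtraction game (remove 1 to 3 stones, taking the last stone wins), which is not one of the paper's benchmark games. All names (`MAX_TAKE`, `PILE`, `best_response`, `self_play`, `play`) are hypothetical and chosen for illustration only.

```python
MAX_TAKE = 3   # assumed toy rule: a move removes 1..3 stones
PILE = 12      # assumed largest pile size the policy table covers


def legal_moves(stones):
    """Moves available to the player facing `stones` stones."""
    return range(1, min(MAX_TAKE, stones) + 1)


def best_response(adversary):
    """Return a policy that beats the fixed `adversary` wherever that is possible.

    A policy is a dict mapping pile size -> stones to take. Because the game
    is symmetric, one table serves the agent in either seat.
    """
    wins = {0: False}  # facing an empty pile means the other side took the last stone
    response = {}
    for stones in range(1, PILE + 1):
        response[stones] = adversary[stones]  # keep the old move if nothing wins
        wins[stones] = False
        for take in legal_moves(stones):
            left = stones - take
            # Win at once, or win after the adversary's fixed reply to `left`.
            if left == 0 or wins[left - adversary[left]]:
                response[stones] = take
                wins[stones] = True
                break
    return response


def self_play(policy):
    """Replace the adversary by its best response until nothing changes."""
    rounds = 0
    while True:
        challenger = best_response(policy)
        if challenger == policy:
            return policy, rounds
        policy, rounds = challenger, rounds + 1  # challenger becomes the adversary


def play(first, second, stones):
    """Play one game; return 0 if the first mover wins, 1 otherwise."""
    players = (first, second)
    turn = 0
    while True:
        stones -= players[turn][stones]
        if stones == 0:
            return turn
        turn = 1 - turn


naive = {s: 1 for s in range(1, PILE + 1)}  # starting adversary: always take one stone
final_policy, rounds = self_play(naive)
```

In this toy game the loop settles on the known optimal rule of leaving the adversary a multiple of four stones. That convergence is a property of this small example; iterated best response carries no such guarantee in general, and the sketch says nothing about the guarantees established in the paper.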