Learning self-play agents for combinatorial optimization problems

Cited by: 6
Authors
Xu, Ruiyang [1]
Lieberherr, Karl [1]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Source
Keywords
GAME; GO;
DOI
10.1017/S026988892000020X
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent progress in reinforcement learning (RL) using self-play has shown remarkable performance with several board games (e.g., Chess and Go) and video games (e.g., Atari games and Dota2). It is plausible to hypothesize that RL, starting from zero knowledge, might be able to gradually approach a winning strategy after a certain amount of training. In this paper, we explore neural Monte Carlo Tree Search (neural MCTS), an RL algorithm that has been applied successfully by DeepMind to play Go and Chess at a superhuman level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problems. A specially designed neural MCTS algorithm is then introduced to train Zermelo game agents. We use a prototype problem for which the ground-truth policy is efficiently computable to demonstrate that neural MCTS is promising.
Pages: 18
Related papers
50 items in total
  • [1] Learning Self-Game-Play Agents for Combinatorial Optimization Problems
    Xu, Ruiyang
    Lieberherr, Karl
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2276 - 2278
  • [2] Self-play Reinforcement Learning for Video Transmission
    Huang, Tianchi
    Zhang, Rui-Xiao
    Sun, Lifeng
    NOSSDAV '20: PROCEEDINGS OF THE 2020 WORKSHOP ON NETWORK AND OPERATING SYSTEM SUPPORT FOR DIGITAL AUDIO AND VIDEO, 2020, : 7 - 13
  • [3] Learning to Drive via Asymmetric Self-Play
    Zhang, Chris
    Biswas, Sourav
    Wong, Kelvin
    Fallah, Kion
    Zhang, Lunjun
    Chen, Dian
    Casas, Sergio
    Urtasun, Raquel
    COMPUTER VISION - ECCV 2024, PT LXII, 2025, 15120 : 149 - 168
  • [4] Self-play reinforcement learning guides protein engineering
    Yi Wang
    Hui Tang
    Lichao Huang
    Lulu Pan
    Lixiang Yang
    Huanming Yang
    Feng Mu
    Meng Yang
    Nature Machine Intelligence, 2023, 5 : 845 - 860
  • [5] Self-Play Reinforcement Learning for Fast Image Retargeting
    Kajiura, Nobukatsu
    Kosugi, Satoshi
    Wang, Xueting
    Yamasaki, Toshihiko
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1755 - 1763
  • [6] Near-Optimal Reinforcement Learning with Self-Play
    Bai, Yu
    Jin, Chi
    Yu, Tiancheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [7] Provable Self-Play Algorithms for Competitive Reinforcement Learning
    Bai, Yu
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [8] Self-play reinforcement learning guides protein engineering
    Wang, Yi
    Tang, Hui
    Huang, Lichao
    Pan, Lulu
    Yang, Lixiang
    Yang, Huanming
    Mu, Feng
    Yang, Meng
    NATURE MACHINE INTELLIGENCE, 2023, 5 (08) : 845 - +
  • [9] A Self-Play Policy Optimization Approach to Battling Pokémon
    Huang, Dan
    Lee, Scott
    2019 IEEE CONFERENCE ON GAMES (COG), 2019,
  • [10] Mastering construction heuristics with self-play deep reinforcement learning
    Wang, Qi
    He, Yuqing
    Tang, Chunlei
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (06) : 4723 - 4738