Learning self-play agents for combinatorial optimization problems

Cited by: 6
Authors
Xu, Ruiyang [1 ]
Lieberherr, Karl [1 ]
Affiliation
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Keywords
GAME; GO
DOI
10.1017/S026988892000020X
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent progress in reinforcement learning (RL) using self-play has shown remarkable performance with several board games (e.g., Chess and Go) and video games (e.g., Atari games and Dota2). It is plausible to hypothesize that RL, starting from zero knowledge, might be able to gradually approach a winning strategy after a certain amount of training. In this paper, we explore neural Monte Carlo Tree Search (neural MCTS), an RL algorithm that has been applied successfully by DeepMind to play Go and Chess at a superhuman level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problems. A specially designed neural MCTS algorithm is then introduced to train Zermelo game agents. We use a prototype problem for which the ground-truth policy is efficiently computable to demonstrate that neural MCTS is promising.
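The abstract describes AlphaZero-style neural MCTS applied to two-player Zermelo games derived from optimization problems. As a rough illustration of the search component only, here is a minimal tabular MCTS with PUCT-style selection and negamax backup on a toy two-player game (single-pile Nim: remove 1 or 2 stones, taking the last stone wins). The game, uniform priors, and random-rollout leaf evaluation are illustrative stand-ins: the paper's actual setup uses a learned policy/value network and its own prototype problem, neither of which is reproduced here.

```python
import math
import random

# Toy Zermelo-style game: single-pile Nim. Players alternate removing 1 or
# 2 stones; whoever takes the last stone wins. Positions that are multiples
# of 3 are losing for the player to move, so the search has a known target.

class Node:
    def __init__(self, stones):
        self.stones = stones   # remaining stones (the game state)
        self.children = {}     # move -> child Node
        self.N = 0             # visit count
        self.W = 0.0           # accumulated value, from this node's player's view

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

def rollout(stones):
    """Random playout; returns the value for the player to move at `stones`.
    (In neural MCTS this would be replaced by a value-network evaluation.)"""
    sign = 1.0
    while stones > 0:
        stones -= random.choice(legal_moves(stones))
        sign = -sign
    return -sign  # the player who took the last stone wins

def puct_select(node, c=1.4):
    """PUCT-style child selection with uniform priors (no policy network)."""
    prior = 1.0 / len(node.children)
    def score(child):
        q = -child.W / child.N if child.N else 0.0      # negamax: flip sign
        u = c * prior * math.sqrt(node.N) / (1 + child.N)
        return q + u
    return max(node.children.values(), key=score)

def simulate(node):
    """One MCTS iteration; returns the value for the player to move at `node`."""
    if node.stones == 0:
        value = -1.0                   # no stones left: current player has lost
    elif node.N == 0:
        value = rollout(node.stones)   # first visit: evaluate the leaf
    else:
        if not node.children:          # second visit: expand
            for m in legal_moves(node.stones):
                node.children[m] = Node(node.stones - m)
        value = -simulate(puct_select(node))
    node.N += 1
    node.W += value
    return value

def best_move(stones, iters=3000):
    """Run MCTS from `stones` and return the most-visited move."""
    root = Node(stones)
    for _ in range(iters):
        simulate(root)
    return max(root.children.items(), key=lambda kv: kv[1].N)[0]
```

For example, from 4 stones the winning reply is to take 1 (leaving the losing position 3), which the search recovers by visit count. The paper's contribution is the Zermelo Gamification layer on top of such a search, mapping an optimization problem's solutions to winning strategies.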
Pages: 18
Related papers
50 entries total (items 21–30 shown)
  • [21] A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
    Liu, Qinghua
    Yu, Tiancheng
    Bai, Yu
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [22] TIYUNTSONG: A SELF-PLAY REINFORCEMENT LEARNING APPROACH FOR ABR VIDEO STREAMING
    Huang, Tianchi
    Yao, Xin
    Wu, Chenglei
    Zhang, Rui-Xiao
    Pang, Zhengyuan
    Sun, Lifeng
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1678 - 1683
  • [23] Mastering Fighting Game Using Deep Reinforcement Learning With Self-play
    Kim, Dae-Wook
    Park, Sungyun
    Yang, Seong-il
    2020 IEEE CONFERENCE ON GAMES (IEEE COG 2020), 2020, : 576 - 583
  • [24] Abalearn: A risk-sensitive approach to self-play learning in abalone
    Campos, P
    Langlois, T
    MACHINE LEARNING: ECML 2003, 2003, 2837 : 35 - 46
  • [25] Zwei: A Self-Play Reinforcement Learning Framework for Video Transmission Services
    Huang, Tianchi
    Zhang, Rui-Xiao
    Sun, Lifeng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1350 - 1365
  • [26] Towards Learning Multi-agent Negotiations via Self-Play
    Tang, Yichuan Charlie
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2427 - 2435
  • [27] Learning Existing Social Conventions via Observationally Augmented Self-Play
    Lerer, Adam
    Peysakhovich, Alexander
    AIES '19: PROCEEDINGS OF THE 2019 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2019, : 107 - 114
  • [28] Learning of Evaluation Functions via Self-Play Enhanced by Checkmate Search
    Nakayashiki, Taichi
    Kaneko, Tomoyuki
    2018 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2018, : 126 - 131
  • [29] Learning Diverse Risk Preferences in Population-Based Self-Play
    Jiang, Yuhua
    Liu, Qihan
    Ma, Xiaoteng
    Li, Chenghao
    Yang, Yiqin
    Yang, Jun
    Liang, Bin
    Zhao, Qianchuan
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 12910 - 12918
  • [30] Learning Algorithms with Self-Play: A New Approach to the Distributed Directory Problem
    Khanchandani, Pankaj
    Richter, Oliver
    Rusch, Lukas
    Wattenhofer, Roger
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 501 - 505