Learning self-play agents for combinatorial optimization problems

Cited by: 6

Authors:

Xu, Ruiyang [1]

Lieberherr, Karl [1]

Affiliations:

[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA

Source:

KNOWLEDGE ENGINEERING REVIEW | 2020 / Vol. 35

Keywords:

GAME; GO;

DOI:

10.1017/S026988892000020X

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent progress in reinforcement learning (RL) using self-play has shown remarkable performance with several board games (e.g., Chess and Go) and video games (e.g., Atari games and Dota2). It is plausible to hypothesize that RL, starting from zero knowledge, might be able to gradually approach a winning strategy after a certain amount of training. In this paper, we explore neural Monte Carlo Tree Search (neural MCTS), an RL algorithm that has been applied successfully by DeepMind to play Go and Chess at a superhuman level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problems. A specially designed neural MCTS algorithm is then introduced to train Zermelo game agents. We use a prototype problem for which the ground-truth policy is efficiently computable to demonstrate that neural MCTS is promising.


Pages: 18

Total: 50 items

[1] Learning Self-Game-Play Agents for Combinatorial Optimization Problems
Xu, Ruiyang
Lieberherr, Karl
AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2276 - 2278
[2] Self-play Reinforcement Learning for Video Transmission
Huang, Tianchi
Zhang, Rui-Xiao
Sun, Lifeng
NOSSDAV '20: PROCEEDINGS OF THE 2020 WORKSHOP ON NETWORK AND OPERATING SYSTEM SUPPORT FOR DIGITAL AUDIO AND VIDEO, 2020, : 7 - 13
[3] Learning to Drive via Asymmetric Self-Play
Zhang, Chris
Biswas, Sourav
Wong, Kelvin
Fallah, Kion
Zhang, Lunjun
Chen, Dian
Casas, Sergio
Urtasun, Raquel
COMPUTER VISION - ECCV 2024, PT LXII, 2025, 15120 : 149 - 168
[4] Self-play reinforcement learning guides protein engineering
Wang, Yi
Tang, Hui
Huang, Lichao
Pan, Lulu
Yang, Lixiang
Yang, Huanming
Mu, Feng
Yang, Meng
NATURE MACHINE INTELLIGENCE, 2023, 5 : 845 - 860
[5] Self-Play Reinforcement Learning for Fast Image Retargeting
Kajiura, Nobukatsu
Kosugi, Satoshi
Wang, Xueting
Yamasaki, Toshihiko
MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1755 - 1763
[6] Near-Optimal Reinforcement Learning with Self-Play
Bai, Yu
Jin, Chi
Yu, Tiancheng
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[7] Provable Self-Play Algorithms for Competitive Reinforcement Learning
Bai, Yu
Jin, Chi
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
[8] Self-play reinforcement learning guides protein engineering
Wang, Yi
Tang, Hui
Huang, Lichao
Pan, Lulu
Yang, Lixiang
Yang, Huanming
Mu, Feng
Yang, Meng
NATURE MACHINE INTELLIGENCE, 2023, 5 (08) : 845 - 860
[9] A Self-Play Policy Optimization Approach to Battling Pokémon
Huang, Dan
Lee, Scott
2019 IEEE CONFERENCE ON GAMES (COG), 2019
[10] Mastering construction heuristics with self-play deep reinforcement learning
Wang, Qi
He, Yuqing
Tang, Chunlei
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (06) : 4723 - 4738
