Learning self-play agents for combinatorial optimization problems

Cited by: 6
Authors: Xu, Ruiyang [1]; Lieberherr, Karl [1]
Affiliation: [1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Keywords: GAME; GO
DOI: 10.1017/S026988892000020X
Chinese Library Classification (CLC): TP18 [Theory of artificial intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Pages: 18
Abstract:
Recent progress in reinforcement learning (RL) using self-play has shown remarkable performance with several board games (e.g., Chess and Go) and video games (e.g., Atari games and Dota2). It is plausible to hypothesize that RL, starting from zero knowledge, might be able to gradually approach a winning strategy after a certain amount of training. In this paper, we explore neural Monte Carlo Tree Search (neural MCTS), an RL algorithm that has been applied successfully by DeepMind to play Go and Chess at a superhuman level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problems. A specially designed neural MCTS algorithm is then introduced to train Zermelo game agents. We use a prototype problem for which the ground-truth policy is efficiently computable to demonstrate that neural MCTS is promising.
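For concreteness, here is a minimal sketch in Python of the AlphaZero-style PUCT selection rule at the heart of the neural MCTS the abstract describes. The function name, the dictionary-based node representation, the value of c_puct, and the toy statistics are illustrative assumptions, not code from the paper.

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing the AlphaZero-style PUCT score.

    children: list of dicts with keys
      'q' -- mean action value observed in past simulations
      'p' -- prior probability assigned by the policy network
      'n' -- visit count of this child
    c_puct: exploration constant (1.5 is an arbitrary illustrative choice).
    """
    total_visits = sum(c['n'] for c in children)

    def score(c):
        # Exploitation term (q) plus an exploration bonus that grows with
        # the parent's total visits and decays as this child is revisited.
        return c['q'] + c_puct * c['p'] * math.sqrt(total_visits) / (1 + c['n'])

    return max(children, key=score)

# Toy usage: three candidate moves with made-up statistics.
children = [
    {'q': 0.10, 'p': 0.5, 'n': 12},
    {'q': 0.30, 'p': 0.3, 'n': 4},
    {'q': 0.05, 'p': 0.2, 'n': 1},
]
print(puct_select(children))  # selects the second move on these numbers
```

Each simulation step trades the network's prior p and the observed mean value q against an exploration bonus that shrinks as a move is visited more often; under the Zermelo Gamification, searching for a winning strategy amounts to repeatedly applying a step of this kind for whichever player is to move.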