Learning self-play agents for combinatorial optimization problems

Cited by: 6
Authors
Xu, Ruiyang [1 ]
Lieberherr, Karl [1 ]
Affiliation
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Keywords
GAME; GO
DOI
10.1017/S026988892000020X
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent progress in reinforcement learning (RL) using self-play has shown remarkable performance with several board games (e.g., Chess and Go) and video games (e.g., Atari games and Dota2). It is plausible to hypothesize that RL, starting from zero knowledge, might be able to gradually approach a winning strategy after a certain amount of training. In this paper, we explore neural Monte Carlo Tree Search (neural MCTS), an RL algorithm that has been applied successfully by DeepMind to play Go and Chess at a superhuman level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problems. A specially designed neural MCTS algorithm is then introduced to train Zermelo game agents. We use a prototype problem for which the ground-truth policy is efficiently computable to demonstrate that neural MCTS is promising.
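The abstract describes AlphaZero-style neural MCTS applied to two-player Zermelo games derived from optimization problems. As a rough illustration of the search component only, here is a minimal tabular MCTS with PUCT-style selection and negamax backup on a toy two-player game (single-pile Nim: remove 1 or 2 stones, taking the last stone wins). The game, uniform priors, and random-rollout leaf evaluation are illustrative stand-ins: the paper's actual setup uses a learned policy/value network and its own prototype problem, neither of which is reproduced here.

```python
import math
import random

# Toy Zermelo-style game: single-pile Nim. Players alternate removing 1 or
# 2 stones; whoever takes the last stone wins. Positions that are multiples
# of 3 are losing for the player to move, so the search has a known target.

class Node:
    def __init__(self, stones):
        self.stones = stones   # remaining stones (the game state)
        self.children = {}     # move -> child Node
        self.N = 0             # visit count
        self.W = 0.0           # accumulated value, from this node's player's view

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

def rollout(stones):
    """Random playout; returns the value for the player to move at `stones`.
    (In neural MCTS this would be replaced by a value-network evaluation.)"""
    sign = 1.0
    while stones > 0:
        stones -= random.choice(legal_moves(stones))
        sign = -sign
    return -sign  # the player who took the last stone wins

def puct_select(node, c=1.4):
    """PUCT-style child selection with uniform priors (no policy network)."""
    prior = 1.0 / len(node.children)
    def score(child):
        q = -child.W / child.N if child.N else 0.0      # negamax: flip sign
        u = c * prior * math.sqrt(node.N) / (1 + child.N)
        return q + u
    return max(node.children.values(), key=score)

def simulate(node):
    """One MCTS iteration; returns the value for the player to move at `node`."""
    if node.stones == 0:
        value = -1.0                   # no stones left: current player has lost
    elif node.N == 0:
        value = rollout(node.stones)   # first visit: evaluate the leaf
    else:
        if not node.children:          # second visit: expand
            for m in legal_moves(node.stones):
                node.children[m] = Node(node.stones - m)
        value = -simulate(puct_select(node))
    node.N += 1
    node.W += value
    return value

def best_move(stones, iters=3000):
    """Run MCTS from `stones` and return the most-visited move."""
    root = Node(stones)
    for _ in range(iters):
        simulate(root)
    return max(root.children.items(), key=lambda kv: kv[1].N)[0]
```

For example, from 4 stones the winning reply is to take 1 (leaving the losing position 3), which the search recovers by visit count. The paper's contribution is the Zermelo Gamification layer on top of such a search, mapping an optimization problem's solutions to winning strategies.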
Pages: 18
Related papers
50 entries total (items 21–30 shown)
  • [21] A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
    Liu, Qinghua
    Yu, Tiancheng
    Bai, Yu
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [22] TIYUNTSONG: A SELF-PLAY REINFORCEMENT LEARNING APPROACH FOR ABR VIDEO STREAMING
    Huang, Tianchi
    Yao, Xin
    Wu, Chenglei
    Zhang, Rui-Xiao
    Pang, Zhengyuan
    Sun, Lifeng
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1678 - 1683
  • [23] Mastering Fighting Game Using Deep Reinforcement Learning With Self-play
    Kim, Dae-Wook
    Park, Sungyun
    Yang, Seong-il
    2020 IEEE CONFERENCE ON GAMES (IEEE COG 2020), 2020, : 576 - 583
  • [24] Abalearn: A risk-sensitive approach to self-play learning in abalone
    Campos, P
    Langlois, T
    MACHINE LEARNING: ECML 2003, 2003, 2837 : 35 - 46
  • [25] Zwei: A Self-Play Reinforcement Learning Framework for Video Transmission Services
    Huang, Tianchi
    Zhang, Rui-Xiao
    Sun, Lifeng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1350 - 1365
  • [26] Towards Learning Multi-agent Negotiations via Self-Play
    Tang, Yichuan Charlie
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2427 - 2435
  • [27] Learning Existing Social Conventions via Observationally Augmented Self-Play
    Lerer, Adam
    Peysakhovich, Alexander
    AIES '19: PROCEEDINGS OF THE 2019 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2019, : 107 - 114
  • [28] Learning of Evaluation Functions via Self-Play Enhanced by Checkmate Search
    Nakayashiki, Taichi
    Kaneko, Tomoyuki
    2018 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2018, : 126 - 131
  • [29] Learning Diverse Risk Preferences in Population-Based Self-Play
    Jiang, Yuhua
    Liu, Qihan
    Ma, Xiaoteng
    Li, Chenghao
    Yang, Yiqin
    Yang, Jun
    Liang, Bin
    Zhao, Qianchuan
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 12910 - 12918
  • [30] Learning Algorithms with Self-Play: A New Approach to the Distributed Directory Problem
    Khanchandani, Pankaj
    Richter, Oliver
    Rusch, Lukas
    Wattenhofer, Roger
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 501 - 505