Converging to a Player Model In Monte-Carlo Tree Search

Cited by: 0

Authors:

Sarratt, Trevor [1]

Pynadath, David V. [2]

Jhala, Arnav [1]

Affiliations:

[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA

[2] USC Inst Creat Technol, Los Angeles, CA 90094 USA

Source:

2014 IEEE Conference on Computational Intelligence and Games (CIG) | 2014

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Player models allow search algorithms to account for differences in agent behavior according to players' preferences and goals. However, it is often not until the first actions are taken that an agent can begin assessing which models are relevant to its current opponent. This paper investigates the integration of belief distributions over player models in the Monte-Carlo Tree Search (MCTS) algorithm. We describe a method of updating belief distributions by leveraging information sampled during the MCTS. We then characterize the effect of tuning MCTS parameters on the convergence of belief distributions. The approach is evaluated against value iteration on an iterated version of the prisoner's dilemma. We show that, for a sufficient number of iterations, our approach converges to the correct model faster than the same model under value iteration.
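The abstract does not state the update rule itself, so the following is only a minimal sketch of what a belief update over player models can look like in an iterated prisoner's dilemma. It uses a plain Bayesian update rather than the paper's MCTS-based method, and every name in it (`always_defect`, `tit_for_tat`, `update_belief`, `NOISE`, `MODELS`) is hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# One round is (agent_move, opponent_move); each move is "C" or "D".
History = List[Tuple[str, str]]
# A player model returns P(opponent cooperates | history). Hypothetical type.
Model = Callable[[History], float]

NOISE = 0.05  # assumed chance that a model deviates from its nominal move


def always_defect(history: History) -> float:
    """Hypothetical model: the opponent almost never cooperates."""
    return NOISE


def tit_for_tat(history: History) -> float:
    """Hypothetical model: the opponent copies the agent's previous move."""
    if not history:
        return 1.0 - NOISE
    agent_last = history[-1][0]
    return 1.0 - NOISE if agent_last == "C" else NOISE


MODELS: Dict[str, Model] = {
    "always_defect": always_defect,
    "tit_for_tat": tit_for_tat,
}


def update_belief(
    belief: Dict[str, float], history: History, observed: str
) -> Dict[str, float]:
    """Bayes rule: posterior(model) is proportional to prior * likelihood."""
    posterior = {}
    for name, prior in belief.items():
        p_coop = MODELS[name](history)
        likelihood = p_coop if observed == "C" else 1.0 - p_coop
        posterior[name] = prior * likelihood
    total = sum(posterior.values())
    if total == 0.0:
        # No model explains the observation; keep the prior unchanged.
        return dict(belief)
    return {name: weight / total for name, weight in posterior.items()}


if __name__ == "__main__":
    belief = {name: 1.0 / len(MODELS) for name in MODELS}
    history: History = []
    for agent_move, opponent_move in [("C", "C"), ("D", "C"), ("C", "D")]:
        belief = update_belief(belief, history, opponent_move)
        history.append((agent_move, opponent_move))
    print(belief)
```

In this sketch the belief shifts toward `tit_for_tat` whenever the opponent cooperates, and stays unchanged on rounds where both models predict the same move. The approach described in the abstract instead draws the information for the update from samples gathered during the tree search.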


Pages: 7

