Converging to a Player Model In Monte-Carlo Tree Search

Authors
Sarratt, Trevor [1 ]
Pynadath, David V. [2 ]
Jhala, Arnav [1 ]
Affiliations
[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[2] USC Inst Creat Technol, Los Angeles, CA 90094 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Player models allow search algorithms to account for differences in agent behavior according to players' preferences and goals. However, it is often not until the first actions are taken that an agent can begin assessing which models are relevant to its current opponent. This paper investigates the integration of belief distributions over player models into the Monte-Carlo Tree Search (MCTS) algorithm. We describe a method of updating belief distributions by leveraging information sampled during the MCTS. We then characterize the effect of tuning the parameters of the MCTS on the convergence of belief distributions. We evaluate this approach against value iteration on an iterated version of the prisoner's dilemma problem. We show that, for a sufficient number of iterations, our approach converges to the correct model faster than the same model under value iteration.
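The abstract does not give the update rule itself, so the following is only a minimal sketch of what a belief distribution over player models might look like in an iterated prisoner's dilemma. It uses a plain Bayes-rule update on observed opponent actions, not the MCTS-sampled update the paper describes. The model names (`tit_for_tat`, `always_defect`, `always_cooperate`), the 0.95/0.05 action probabilities, and the function `update_belief` are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Actions are "C" (cooperate) or "D" (defect).
# A hypothetical player model maps our previous action to the
# probability that the modeled opponent cooperates next.
PlayerModel = Callable[[str], float]


def tit_for_tat(prev_own_action: str) -> float:
    # Assumed noisy tit-for-tat: mostly mirrors our previous action.
    return 0.95 if prev_own_action == "C" else 0.05


def always_defect(prev_own_action: str) -> float:
    # Assumed noisy defector: cooperates only rarely.
    return 0.05


def always_cooperate(prev_own_action: str) -> float:
    # Assumed noisy cooperator: defects only rarely.
    return 0.95


MODELS: Dict[str, PlayerModel] = {
    "tit_for_tat": tit_for_tat,
    "always_defect": always_defect,
    "always_cooperate": always_cooperate,
}


def update_belief(
    belief: Dict[str, float],
    models: Dict[str, PlayerModel],
    prev_own_action: str,
    observed_action: str,
) -> Dict[str, float]:
    """Bayes-rule update: posterior is proportional to prior times likelihood."""
    unnormalized = {}
    for name, prior in belief.items():
        p_cooperate = models[name](prev_own_action)
        likelihood = p_cooperate if observed_action == "C" else 1.0 - p_cooperate
        unnormalized[name] = prior * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        # No model explains the observation; keep the prior unchanged.
        return dict(belief)
    return {name: value / total for name, value in unnormalized.items()}


def run_demo() -> Dict[str, float]:
    # Start from a uniform belief over the three hypothetical models.
    belief = {name: 1.0 / len(MODELS) for name in MODELS}
    # (our previous action, opponent's observed response) pairs that are
    # consistent with a tit-for-tat opponent.
    history: List[Tuple[str, str]] = [("C", "C"), ("D", "D"), ("C", "C")]
    for prev_own_action, observed_action in history:
        belief = update_belief(belief, MODELS, prev_own_action, observed_action)
    return belief


if __name__ == "__main__":
    for name, probability in run_demo().items():
        print(f"{name}: {probability:.3f}")
```

In this sketch the belief concentrates on whichever model best explains the observed actions; how quickly it does so depends on the assumed action probabilities, which stand in for whatever the search itself would supply.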
Pages: 7