LinUCB applied to Monte Carlo tree search

Cited by: 3
Authors
Mandai, Yusaku [1]
Kaneko, Tomoyuki [1]
Affiliations
[1] Univ Tokyo, Grad Sch Arts & Sci, Tokyo, Japan
Keywords
MCTS; Multi-armed bandit problem; Contextual bandit; LinUCB; Computer Go
DOI
10.1016/j.tcs.2016.06.035
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
UCT is a standard Monte Carlo tree search (MCTS) algorithm, and MCTS algorithms have been applied to various domains with remarkable success. This study proposes a family of LinUCT algorithms that incorporate LinUCB into MCTS algorithms. LinUCB is a recently developed method that generalizes past episodes by ridge regression with feature vectors and rewards. LinUCB outperforms UCB1 in contextual multi-armed bandit problems. We introduce a straightforward application of LinUCB, LinUCT_PLAIN, by substituting UCB1 with LinUCB in UCT. We show that it does not work well owing to the minimax structure of game trees. To better handle such tree structures, we present LinUCT_RAVE and LinUCT_FP by further incorporating two existing techniques: rapid action value estimation (RAVE) and feature propagation, which recursively propagates the feature vector of a node to that of its parent. Experiments were conducted with two models. The first is a synthetic model, an extension of the standard incremental random tree model in which each node has a feature vector that represents the characteristics of the corresponding position. The second is Finnsson's shock step game, which is used to empirically analyze the performance of UCT with respect to the distribution of suboptimal moves. The experimental results indicate that LinUCT_RAVE and LinUCT_FP outperform UCT, especially when the branching factor is relatively large. (C) 2016 Elsevier B.V. All rights reserved.
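The abstract describes LinUCB as ridge regression over feature vectors and rewards. The following is a minimal sketch of that idea for a single arm, not the paper's LinUCT algorithms. It assumes the commonly used "disjoint" form of LinUCB, where each arm keeps its own regression and is scored by its estimated reward plus a confidence bonus. The names `LinUCBArm` and `select_arm` and the parameters `alpha` and `ridge` are illustrative assumptions and do not come from the record above.

```python
import math


class LinUCBArm:
    """One arm of disjoint LinUCB (illustrative sketch, not the paper's code)."""

    def __init__(self, dim: int, alpha: float = 1.0, ridge: float = 1.0):
        self.alpha = alpha  # exploration weight (assumed parameter name)
        # Inverse of A = ridge * I + sum(x x^T), maintained incrementally.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * x

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def score(self, x):
        """Upper confidence bound: theta^T x + alpha * sqrt(x^T A^{-1} x)."""
        theta = self._a_inv_times(self.b)  # ridge regression estimate
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(0.0, sum(xi * v for xi, v in zip(x, a_inv_x))))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A^{-1}, then accumulate b."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def select_arm(arms, features):
    """Return the index of the arm with the highest LinUCB score."""
    return max(range(len(arms)), key=lambda i: arms[i].score(features[i]))
```

In a UCT-style search, a selection rule of this kind would stand in for UCB1 when choosing a child node, with `x` being the feature vector of the child position. How the paper handles the minimax structure (RAVE, feature propagation) is not covered by this sketch.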
Pages: 114-126
Number of pages: 13
Related papers
50 in total
  • [1] LinUCB Applied to Monte-Carlo Tree Search
    Mandai, Yusaku
    Kaneko, Tomoyuki
    [J]. ADVANCES IN COMPUTER GAMES, ACG 2015, 2015, 9525 : 41 - 52
  • [2] Monte Carlo Tree Search Applied to Co-operative Problems
    Williams, Piers R.
    Walton-Rivers, Joseph
    Perez-Liebana, Diego
    Lucas, Simon M.
    [J]. 2015 7TH COMPUTER SCIENCE AND ELECTRONIC ENGINEERING CONFERENCE (CEEC), 2015, : 219 - 224
  • [3] Multiagent Monte Carlo Tree Search
    Zerbel, Nicholas
    Yliniemi, Logan
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2309 - 2311
  • [4] Monte Carlo Tree Search with Metaheuristics
    Mandziuk, Jacek
    Walczak, Patryk
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2023, PT II, 2023, 14126 : 134 - 144
  • [5] Elastic Monte Carlo Tree Search
    Xu, Linjie
    Dockhorn, Alexander
    Perez-Liebana, Diego
    [J]. IEEE TRANSACTIONS ON GAMES, 2023, 15 (04) : 527 - 537
  • [6] Monte Carlo Tree Search in Hex
    Arneson, Broderick
    Hayward, Ryan B.
    Henderson, Philip
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2010, 2 (04) : 251 - 258
  • [7] Monte Carlo tree search in Kriegspiel
    Ciancarini, Paolo
    Favini, Gian Piero
    [J]. ARTIFICIAL INTELLIGENCE, 2010, 174 (11) : 670 - 684
  • [8] Monte Carlo Tree Search for Quoridor
    Respall, Victor Massague
    Brown, Joseph Alexander
    Aslam, Hamna
    [J]. 19TH INTERNATIONAL CONFERENCE ON INTELLIGENT GAMES AND SIMULATION (GAME-ON(R) 2018), 2018, : 5 - 9
  • [9] An Analysis of Monte Carlo Tree Search
    James, Steven
    Konidaris, George
    Rosman, Benjamin
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3576 - 3582
  • [10] MONTE CARLO TREE SEARCH: A TUTORIAL
    Fu, Michael C.
    [J]. 2018 WINTER SIMULATION CONFERENCE (WSC), 2018, : 222 - 236