Non-Asymptotic Analysis of Monte Carlo Tree Search

Cited by: 3
Authors
Shah D. [1]
Xie Q. [2]
Xu Z. [1]
Affiliations
[1] LIDS, MIT, Cambridge, MA
[2] ORIE, Cornell University, Ithaca, NY
Source
Performance Evaluation Review | 2020 / Vol. 48 / No. 01
DOI
10.1145/3393691.3394202
Abstract
In this work, we consider the popular tree-based search strategy within the framework of reinforcement learning, Monte Carlo Tree Search (MCTS), in the context of an infinite-horizon discounted-cost Markov Decision Process (MDP) with deterministic transitions. While MCTS is believed to provide an approximate value function for a given state with enough simulations, cf. [5, 6], the claimed proof of this property is incomplete. This is because the variant of MCTS analyzed in prior works, the Upper Confidence Bound for Trees (UCT), utilizes a "logarithmic" bonus term for balancing exploration and exploitation within the tree-based search, following insights from the stochastic multi-armed bandit (MAB) literature, cf. [1, 3]. In effect, such an approach assumes that the regret of the underlying recursively dependent non-stationary MABs concentrates around its mean exponentially in the number of steps, which is unlikely to hold, as pointed out in [2], even for stationary MABs. © 2020 Copyright is held by the owner/author(s).
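For orientation, the "logarithmic" bonus mentioned in the abstract can be illustrated with a UCB1-style selection rule from the stochastic MAB literature. The following is a minimal sketch, not the authors' algorithm. It assumes a reward-maximization convention (the abstract speaks of costs), and the function names, the `stats` layout, and the constant `c` are hypothetical choices made for illustration only.

```python
import math


def ucb_log_score(mean_reward, parent_visits, child_visits, c=math.sqrt(2)):
    """Empirical mean plus a logarithmic exploration bonus (UCB1-style)."""
    if child_visits == 0:
        # Unvisited actions get an infinite score so they are tried first.
        return math.inf
    bonus = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + bonus


def select_action(stats, c=math.sqrt(2)):
    """Return the action with the highest score at one tree node.

    stats maps action -> (total_reward, visit_count); this layout is an
    assumption of the sketch, not something specified by the abstract.
    """
    parent_visits = sum(n for _, n in stats.values())

    def score(action):
        total, n = stats[action]
        mean = total / n if n else 0.0
        return ucb_log_score(mean, parent_visits, n, c)

    return max(stats, key=score)
```

Because the bonus grows only like the square root of log(parent_visits), rarely chosen actions are revisited very infrequently. That slow exploration is what makes the concentration assumption described in the abstract a delicate one.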
Pages: 31 / 32
Page count: 1
Related papers
50 in total
  • [21] Information Set Monte Carlo Tree Search
    Cowling, Peter I.
    Powley, Edward J.
    Whitehouse, Daniel
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (02) : 120 - 143
  • [22] State Aggregation in Monte Carlo Tree Search
    Hostetler, Jesse
    Fern, Alan
    Dietterich, Tom
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 2446 - 2452
  • [23] Monte Carlo Tree Search with Robust Exploration
    Imagawa, Takahisa
    Kaneko, Tomoyuki
    COMPUTERS AND GAMES, CG 2016, 2016, 10068 : 34 - 46
  • [24] Multiple Pass Monte Carlo Tree Search
    McGuinness, Cameron
    2016 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2016, : 1555 - 1561
  • [25] On Monte Carlo Tree Search and Reinforcement Learning
    Vodopivec, Tom
    Samothrakis, Spyridon
    Ster, Branko
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2017, 60 : 881 - 936
  • [26] Learning in POMDPs with Monte Carlo Tree Search
    Katt, Sammie
    Oliehoek, Frans A.
    Amato, Christopher
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [27] Playing Carcassonne with Monte Carlo Tree Search
    Ameneyro, Fred Valdez
    Galvan, Edgar
    Fernando, Angel
    Morales, Kuri
    2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2020, : 2343 - 2350
  • [28] Monte Carlo Tree Search for Love Letter
    Omarov, Tamirlan
    Aslam, Hamna
    Brown, Joseph Alexander
    Reading, Elizabeth
    19TH INTERNATIONAL CONFERENCE ON INTELLIGENT GAMES AND SIMULATION (GAME-ON(R) 2018), 2018, : 10 - 15
  • [29] Incentive Learning in Monte Carlo Tree Search
    Kao, Kuo-Yuan
    Wu, I-Chen
    Yen, Shi-Jim
    Shan, Yi-Chang
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2013, 5 (04) : 346 - 352
  • [30] Monte Carlo Tree Search With Reversibility Compression
    Cook, Michael
    2021 IEEE CONFERENCE ON GAMES (COG), 2021, : 556 - 563