Non-Asymptotic Analysis of Monte Carlo Tree Search

Cited by: 3
Authors
Shah D. [1]
Xie Q. [2]
Xu Z. [1]
Affiliations
[1] LIDS, MIT, Cambridge, MA
[2] ORIE, Cornell University, Ithaca, NY
Source
Performance Evaluation Review | 2020 / Vol. 48 / No. 1
DOI
10.1145/3393691.3394202
Abstract
In this work, we consider Monte Carlo Tree Search (MCTS), the popular tree-based search strategy within the framework of reinforcement learning, in the context of infinite-horizon discounted cost Markov Decision Processes (MDPs) with deterministic transitions. While MCTS is believed to provide an approximate value function for a given state with enough simulations, cf. [5, 6], the claimed proof of this property is incomplete. The reason is that the variant of MCTS analyzed in prior works, the Upper Confidence Bound for Trees (UCT), uses a "logarithmic" bonus term to balance exploration and exploitation within the tree-based search, following insights from the stochastic multi-armed bandit (MAB) literature, cf. [1, 3]. In effect, such an approach assumes that the regret of the underlying recursively dependent non-stationary MABs concentrates around its mean exponentially in the number of steps, which is unlikely to hold, as pointed out in [2], even for stationary MABs. © 2020 Copyright is held by the owner/author(s).
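As an illustration of the "logarithmic" bonus term mentioned in the abstract, the following is a minimal sketch of a UCB1-style action index of the kind commonly associated with UCT. It is not taken from the paper: the function names, the constant c, and the reward-maximizing convention are assumptions made for the example (for a cost formulation, one would negate the values).

```python
import math


def ucb_log_score(mean_value, parent_visits, child_visits, c=math.sqrt(2)):
    """Empirical mean plus a logarithmic exploration bonus (UCB1-style).

    Hypothetical helper: the constant c and the reward-maximizing
    convention are assumptions, not details from the abstract.
    """
    if child_visits == 0:
        return math.inf  # unvisited actions are tried first
    bonus = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_value + bonus


def select_action(stats):
    """Pick the action with the largest index.

    stats maps action -> (empirical mean value, visit count).
    """
    total_visits = sum(count for _, count in stats.values())
    return max(
        stats,
        key=lambda a: ucb_log_score(stats[a][0], total_visits, stats[a][1]),
    )


if __name__ == "__main__":
    # The rarely visited action "b" receives the larger bonus.
    print(select_action({"a": (0.5, 10), "b": (0.4, 1)}))
```

The bonus grows only logarithmically in the parent's visit count, which is the design choice whose justification the abstract calls into question for the recursively dependent, non-stationary bandits inside the tree.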
Pages: 31 - 32
Number of pages: 1
Related papers
50 items in total
  • [41] Monte-Carlo Tree Search with Tree Shape Control
    Marchenko, Oleksandr I.
    Marchenko, Oleksii O.
    2017 IEEE FIRST UKRAINE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (UKRCON), 2017, : 812 - 817
  • [42] Context-Tree Weighting and Bayesian Context Trees: Asymptotic and Non-Asymptotic Justifications
    Kontoyiannis, Ioannis
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (02) : 1204 - 1219
  • [43] Non-asymptotic analysis of tangent space perturbation
    Kaslovsky, Daniel N.
    Meyer, Francois G.
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2014, 3 (02) : 134 - 187
  • [44] Analysis of the Impact of Randomization of Search-Control Parameters in Monte-Carlo Tree Search
    Sironi, Chiara F.
    Winands, Mark H. M.
    Journal of Artificial Intelligence Research, 2021, 72 : 715 - 757
  • [45] Analysis of the Impact of Randomization of Search-Control Parameters in Monte-Carlo Tree Search
    Sironi, Chiara F.
    Winands, Mark H. M.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2021, 72 : 717 - 757
  • [46] Monte Carlo Tree Search Techniques in the Game of Kriegspiel
    Ciancarini, Paolo
    Favini, Gian Piero
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 474 - 479
  • [47] Monte Carlo Tree Search for Scheduling Activity Recognition
    Amer, Mohamed R.
    Todorovic, Sinisa
    Fern, Alan
    Zhu, Song-Chun
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 1353 - 1360
  • [48] Transpositions and Move Groups in Monte Carlo Tree Search
    Childs, Benjamin E.
    Brodeur, James H.
    Kocsis, Levente
    2008 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND GAMES, 2008, : 389 - +
  • [49] Monte Carlo Tree Search for Priced Timed Automata
    Jensen, Peter Gjol
    Kiviriga, Andrej
    Larsen, Kim Guldstrand
    Nyman, Ulrik
    Mijacika, Adriana
    Mortensen, Jeppe Hoiriis
    QUANTITATIVE EVALUATION OF SYSTEMS (QEST 2022), 2022, 13479 : 381 - 398
  • [50] Using Local Regression in Monte Carlo Tree Search
    Randrianasolo, Arisoa S.
    Pyeatt, Larry D.
    2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 1, 2012, : 500 - 503