Non-Asymptotic Analysis of Monte Carlo Tree Search

Cited by: 3
Authors
Shah D. [1]
Xie Q. [2]
Xu Z. [1]
Affiliations
[1] LIDS, MIT, Cambridge, MA
[2] ORIE, Cornell University, Ithaca, NY
Source
Performance Evaluation Review | 2020 / Vol. 48 / No. 1
DOI
10.1145/3393691.3394202
Abstract
In this work, we consider the popular tree-based search strategy within the framework of reinforcement learning, Monte Carlo Tree Search (MCTS), in the context of infinite-horizon discounted-cost Markov Decision Processes (MDPs) with deterministic transitions. While MCTS is believed to provide an approximate value function for a given state with enough simulations, cf. [5, 6], the claimed proof of this property is incomplete. This is because the variant of MCTS analyzed in prior works, Upper Confidence Bound for Trees (UCT), uses a "logarithmic" bonus term to balance exploration and exploitation within the tree-based search, following insights from the stochastic multi-armed bandit (MAB) literature, cf. [1, 3]. In effect, such an approach assumes that the regret of the underlying recursively dependent non-stationary MABs concentrates around its mean exponentially in the number of steps, which is unlikely to hold, as pointed out in [2], even for stationary MABs. © 2020 Copyright is held by the owner/author(s).
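For readers unfamiliar with the "logarithmic" bonus mentioned in the abstract, the following is a minimal sketch of a UCB1-style selection rule of the kind used at each tree node in UCT. It is an illustration under stated assumptions, not the authors' algorithm: it is written for rewards to be maximized (the paper works with costs), and the names `ucb_scores`, `select_action`, and the constant `c` are hypothetical.

```python
import math


def ucb_scores(means, counts, total, c=math.sqrt(2)):
    """Score each action as empirical mean plus a logarithmic exploration bonus.

    means  -- empirical mean reward of each action (assumed rewards, not costs)
    counts -- number of times each action has been tried
    total  -- total number of trials at this node
    c      -- exploration constant (hypothetical default, as in classic UCB1)
    """
    scores = []
    for mean, n in zip(means, counts):
        if n == 0:
            # Untried actions get priority so every action is sampled once.
            scores.append(math.inf)
        else:
            # The bonus shrinks as n grows and grows only logarithmically in total.
            scores.append(mean + c * math.sqrt(math.log(total) / n))
    return scores


def select_action(means, counts, total, c=math.sqrt(2)):
    """Return the index of the action with the highest score."""
    scores = ucb_scores(means, counts, total, c)
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the action with the higher empirical mean; a rarely tried action can still be selected because its bonus is large.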
Pages: 31 / 32
Number of pages: 1
Related papers
50 in total
  • [1] Non-Asymptotic Analysis of Fractional Langevin Monte Carlo for Non-Convex Optimization
    Thanh Huy Nguyen
    Simsekli, Umut
    Richard, Gael
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [2] An Analysis of Monte Carlo Tree Search
    James, Steven
    Konidaris, George
    Rosman, Benjamin
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3576 - 3582
  • [3] Nonasymptotic Analysis of Monte Carlo Tree Search
    Shah, Devavrat
    Xie, Qiaomin
    Xu, Zhi
    OPERATIONS RESEARCH, 2022, 70 (06) : 3234 - 3260
  • [4] POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis
    Mao, Weichao
    Zhang, Kaiqing
    Xie, Qiaomin
    Basar, Tamer
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [5] Multiagent Monte Carlo Tree Search
    Zerbel, Nicholas
    Yliniemi, Logan
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2309 - 2311
  • [6] Monte Carlo Tree Search with Metaheuristics
    Mandziuk, Jacek
    Walczak, Patryk
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2023, PT II, 2023, 14126 : 134 - 144
  • [7] Elastic Monte Carlo Tree Search
    Xu, Linjie
    Dockhorn, Alexander
    Perez-Liebana, Diego
    IEEE TRANSACTIONS ON GAMES, 2023, 15 (04) : 527 - 537
  • [8] Monte Carlo Tree Search in Hex
    Arneson, Broderick
    Hayward, Ryan B.
    Henderson, Philip
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2010, 2 (04) : 251 - 258
  • [9] Monte Carlo tree search in Kriegspiel
    Ciancarini, Paolo
    Favini, Gian Piero
    ARTIFICIAL INTELLIGENCE, 2010, 174 (11) : 670 - 684
  • [10] MONTE CARLO TREE SEARCH: A TUTORIAL
    Fu, Michael C.
    2018 WINTER SIMULATION CONFERENCE (WSC), 2018, : 222 - 236