Monte Carlo tree search with optimal computing budget allocation

Cited by: 0
Authors
Li, Yunchuan [1, 2]
Fu, Michael [2, 3]
Xu, Jie [4]
Affiliations
[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
[2] Univ Maryland, Syst Res Inst, College Pk, MD 20742 USA
[3] Univ Maryland, Robert H Smith Sch Business, College Pk, MD 20742 USA
[4] George Mason Univ, Dept Syst Engn & Operat Res, Fairfax, VA 22030 USA
Funding
National Science Foundation (USA)
Keywords
EFFICIENCY
DOI
10.1109/cdc40024.2019.9030099
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We analyze a tree search problem with an underlying Markov decision process, in which the goal is to identify the best action at the root that achieves the highest cumulative reward. We present a new tree policy that optimally allocates a limited computing budget to maximize a lower bound on the probability of correctly selecting the best action at each node. Compared to the widely used Upper Confidence Bound (UCB) type of tree policies, the new tree policy presents a more balanced approach to manage the exploration and exploitation trade-off when the sampling budget is limited. Furthermore, UCB assumes that the support of reward distribution is known, whereas our algorithm relaxes this assumption, and can be applied to game trees with mild modifications. A numerical experiment is conducted to demonstrate the efficiency of our algorithm in selecting the best action at the root.
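For orientation, the budget-allocation idea named in the abstract can be illustrated with the classical OCBA rule from ranking and selection, applied to the child actions of a single node. This is only a minimal sketch under stated assumptions: it is not the tree policy derived in the paper, the function name `ocba_fractions` and the `eps` guard against tied means are hypothetical, and the sample means and standard deviations are assumed to be already estimated and positive where required.

```python
import math

def ocba_fractions(means, stds, eps=1e-12):
    """Classical OCBA budget fractions for selecting the largest mean.

    means, stds: sample means and (positive) sample standard deviations
    of each candidate action. Returns fractions that sum to 1.
    """
    k = len(means)
    best = max(range(k), key=lambda i: means[i])
    weights = [0.0] * k
    for i in range(k):
        if i != best:
            # Gap to the current best; eps avoids division by zero on ties
            # (an illustrative guard, not part of the classical rule).
            delta = max(means[best] - means[i], eps)
            # Non-best actions: weight proportional to (sigma_i / delta_i)^2.
            weights[i] = (stds[i] / delta) ** 2
    # Best action: sigma_b * sqrt(sum over non-best of (N_i / sigma_i)^2).
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != best)
    )
    total = sum(weights)
    return [w / total for w in weights]

fractions = ocba_fractions([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

In this sketch, a non-best action whose mean is closer to the best, or whose samples are noisier, receives a larger share of the budget. Unlike a UCB-style index, the rule uses estimated variances rather than a known bound on the reward support.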
Pages: 6332-6337
Page count: 6
Related papers
50 in total
  • [31] LinUCB applied to Monte Carlo tree search
    Mandai, Yusaku
    Kaneko, Tomoyuki
    THEORETICAL COMPUTER SCIENCE, 2016, 644 : 114 - 126
  • [32] Monte Carlo Tree Search for Trading and Hedging
    Vittori, Edoardo
    Likmeta, Amarildo
    Restelli, Marcello
    ICAIF 2021: THE SECOND ACM INTERNATIONAL CONFERENCE ON AI IN FINANCE, 2021,
  • [33] A Survey of Monte Carlo Tree Search Methods
    Browne, Cameron B.
    Powley, Edward
    Whitehouse, Daniel
    Lucas, Simon M.
    Cowling, Peter I.
    Rohlfshagen, Philipp
    Tavener, Stephen
    Perez, Diego
    Samothrakis, Spyridon
    Colton, Simon
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (01) : 1 - 43
  • [34] Nonasymptotic Analysis of Monte Carlo Tree Search
    Shah, Devavrat
    Xie, Qiaomin
    Xu, Zhi
    OPERATIONS RESEARCH, 2022, 70 (06) : 3234 - 3260
  • [35] Information Set Monte Carlo Tree Search
    Cowling, Peter I.
    Powley, Edward J.
    Whitehouse, Daniel
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, 2012, 4 (02) : 120 - 143
  • [36] State Aggregation in Monte Carlo Tree Search
    Hostetler, Jesse
    Fern, Alan
    Dietterich, Tom
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 2446 - 2452
  • [37] Monte Carlo Tree Search with Robust Exploration
    Imagawa, Takahisa
    Kaneko, Tomoyuki
    COMPUTERS AND GAMES, CG 2016, 2016, 10068 : 34 - 46
  • [38] Multiple Pass Monte Carlo Tree Search
    McGuinness, Cameron
    2016 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2016, : 1555 - 1561
  • [39] On Monte Carlo Tree Search and Reinforcement Learning
    Vodopivec, Tom
    Samothrakis, Spyridon
    Ster, Branko
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2017, 60 : 881 - 936
  • [40] Learning in POMDPs with Monte Carlo Tree Search
    Katt, Sammie
    Oliehoek, Frans A.
    Amato, Christopher
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70