Monte Carlo tree search with optimal computing budget allocation

Authors
Li, Yunchuan [1, 2]
Fu, Michael [2, 3]
Xu, Jie [4]
Affiliations
[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
[2] Univ Maryland, Syst Res Inst, College Pk, MD 20742 USA
[3] Univ Maryland, Robert H Smith Sch Business, College Pk, MD 20742 USA
[4] George Mason Univ, Dept Syst Engn & Operat Res, Fairfax, VA 22030 USA
Funding
National Science Foundation (USA)
Keywords
EFFICIENCY
DOI
10.1109/cdc40024.2019.9030099
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We analyze a tree search problem with an underlying Markov decision process, in which the goal is to identify the best action at the root, that is, the action that achieves the highest cumulative reward. We present a new tree policy that optimally allocates a limited computing budget to maximize a lower bound on the probability of correctly selecting the best action at each node. Compared to the widely used Upper Confidence Bound (UCB) type of tree policies, the new tree policy offers a more balanced approach to managing the exploration-exploitation trade-off when the sampling budget is limited. Furthermore, UCB assumes that the support of the reward distribution is known, whereas our algorithm relaxes this assumption and can be applied to game trees with mild modifications. A numerical experiment is conducted to demonstrate the efficiency of our algorithm in selecting the best action at the root.
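To make the budget-allocation idea in the abstract concrete, the following is a minimal sketch of the classic optimal computing budget allocation (OCBA) ratios for choosing the best of several alternatives under a maximization objective. It is not taken from the paper and is not claimed to be the authors' tree policy. The function name `ocba_fractions` and its inputs (per-action sample means and sample standard deviations at a single node) are assumptions made for illustration. The sketch also assumes a unique best sample mean and strictly positive standard deviations.

```python
import math


def ocba_fractions(means, stds):
    """Classic OCBA budget fractions for picking the action with the largest mean.

    Hypothetical helper for illustration only, not the paper's algorithm.
    means: sample mean reward of each action at one node.
    stds:  sample standard deviation of each action (all must be > 0).
    Returns a list of fractions that sum to 1.
    """
    k = len(means)
    if k < 2 or len(stds) != k:
        raise ValueError("need at least two actions with matching stds")
    if any(s <= 0 for s in stds):
        raise ValueError("standard deviations must be positive")

    # Index of the current sample-best action.
    b = max(range(k), key=lambda i: means[i])

    weights = [0.0] * k
    for i in range(k):
        if i == b:
            continue
        delta = means[b] - means[i]  # gap to the sample-best action
        if delta <= 0:
            raise ValueError("sketch assumes a unique best sample mean")
        # Non-best actions: weight proportional to (sigma_i / delta_i)^2.
        weights[i] = (stds[i] / delta) ** 2

    # Best action: N_b = sigma_b * sqrt(sum over i != b of (N_i / sigma_i)^2).
    weights[b] = stds[b] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != b)
    )

    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three actions with equal noise; action 0 is the sample best.
    print(ocba_fractions([1.0, 0.5, 0.0], [1.0, 1.0, 1.0]))
```

In this example the close competitor (mean 0.5) receives about four times the budget of the distant one (mean 0.0), and the sample-best action receives the largest share. Unlike a UCB index, these ratios use estimated variances and gaps rather than a known bound on the reward support, which is consistent with the relaxation the abstract mentions.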
Pages: 6332-6337
Number of pages: 6