Can Monte-Carlo Tree Search learn to sacrifice?

Authors
Nathan Companez
Aldeida Aleti
Affiliation
[1] Monash University, Faculty of Information Technology
Source
Journal of Heuristics | 2016, Vol. 22
Keywords
Monte-Carlo Tree Search; Sacrifice moves; Artificial intelligence; Games
DOI
Not available
Abstract
One of the most basic activities performed by an intelligent agent is deciding what to do next. The decision is usually between selecting the move with the highest expected reward and exploring new scenarios. Monte-Carlo Tree Search (MCTS), which was developed as a game-playing agent, handles this exploration–exploitation ‘dilemma’ using a multi-armed bandit strategy. The success of MCTS in a wide range of problems, such as combinatorial optimisation, reinforcement learning, and games, is due to its ability to rapidly evaluate problem states without requiring domain-specific knowledge. However, it has been acknowledged that the trade-off between exploration and exploitation is crucial for the performance of the algorithm, and affects how efficiently the agent learns deceptive states. One type of deceptive state gives an immediate reward but leads to a suboptimal solution in the long run. Such states are known as trap states, and have been thoroughly investigated in previous research. In this work, we study the opposite of trap states, known as sacrifice states, which are deceptive moves that result in a local loss but are globally optimal, and investigate the efficiency of MCTS enhancements in identifying this type of move.
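The abstract does not say which bandit formula is used. As a minimal sketch, assuming the widely used UCB1 rule, the selection step could look like the following. The function name `ucb1_select`, the `(total_reward, visits)` layout, and the constant `c` are illustrative assumptions, not taken from the paper.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Return the index of the child move with the highest UCB1 score.

    stats is a list of (total_reward, visits) pairs, one per child move.
    This layout is a hypothetical choice made for the example.
    """
    total_visits = sum(visits for _, visits in stats)
    best_index, best_score = None, float("-inf")
    for index, (total_reward, visits) in enumerate(stats):
        if visits == 0:
            return index  # try every move at least once
        mean = total_reward / visits  # exploitation term
        bonus = c * math.sqrt(math.log(total_visits) / visits)  # exploration term
        score = mean + bonus
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Move 0 looks better so far (mean 0.6) but has been visited often;
# move 1 looks worse (mean 0.5) but has barely been explored.
stats = [(6.0, 10), (1.0, 2)]
explore_choice = ucb1_select(stats)         # exploration bonus included
greedy_choice = ucb1_select(stats, c=0.0)   # pure exploitation
```

With the exploration bonus the rarely visited move is chosen, while the greedy setting `c=0.0` keeps picking the move with the higher current mean. A sacrifice move starts out looking like the second one, so how quickly it is recognised depends on how much exploration the rule allows.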
Pages: 783-813
Number of pages: 30