Can Monte-Carlo Tree Search learn to sacrifice?

Cited by: 0
Authors
Nathan Companez
Aldeida Aleti
Affiliation
[1] Monash University, Faculty of Information Technology
Source
Journal of Heuristics | 2016 / Vol. 22
Keywords
Monte-Carlo Tree Search; Sacrifice moves; Artificial intelligence; Games;
DOI
Not available
Abstract
One of the most basic activities performed by an intelligent agent is deciding what to do next. The decision is usually about selecting the move with the highest expectation, or exploring new scenarios. Monte-Carlo Tree Search (MCTS), which was developed as a game playing agent, deals with this exploration–exploitation ‘dilemma’ using a multi-armed bandit strategy. The success of MCTS in a wide range of problems, such as combinatorial optimisation, reinforcement learning, and games, is due to its ability to rapidly evaluate problem states without requiring domain-specific knowledge. However, it has been acknowledged that the trade-off between exploration and exploitation is crucial for the performance of the algorithm, and affects the efficiency of the agent in learning deceptive states. One type of deception is states that give immediate rewards but lead to a suboptimal solution in the long run. These states are known as trap states, and have been thoroughly investigated in previous research. In this work, we study the opposite of trap states, known as sacrifice states: deceptive moves that incur a local loss but are globally optimal. We investigate the efficiency of MCTS enhancements in identifying this type of move.
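The multi-armed bandit treatment of the exploration–exploitation trade-off mentioned in the abstract is usually realised as the UCT selection rule (UCB1 applied at each tree node). The Python sketch below illustrates that rule only; the Node fields, function names, and the exploration constant c are illustrative assumptions, not details taken from this paper.

```python
import math
import random

# Minimal sketch of UCT child selection (UCB1 at each node).
# All names and the default constant c are illustrative, not from the paper.

class Node:
    def __init__(self, move=None, parent=None):
        self.move = move
        self.parent = parent
        self.children = []
        self.visits = 0          # number of simulations through this node
        self.total_reward = 0.0  # sum of simulation rewards

def uct_value(child, parent_visits, c=math.sqrt(2)):
    # Unvisited children are tried first.
    if child.visits == 0:
        return float("inf")
    exploitation = child.total_reward / child.visits                 # average reward
    exploration = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploitation + exploration

def select_child(node):
    # Pick the child maximising UCB1; ties are broken randomly.
    return max(node.children,
               key=lambda ch: (uct_value(ch, node.visits), random.random()))
```

A larger c makes the square-root exploration term dominate, so rarely visited children (for example, moves that look like immediate losses) are re-examined more often; a smaller c biases selection towards the current best average reward, which is one way the trade-off discussed in the abstract manifests in practice.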
Pages: 783 - 813
Number of pages: 30
Related papers
50 records in total
  • [31] Experiments with Monte-Carlo Tree Search in the Game of Havannah
    Lorentz, Richard J.
    ICGA JOURNAL, 2011, 34 (03) : 140 - 149
  • [32] Monte-Carlo Tree Search by Best Arm Identification
    Kaufmann, Emilie
    Koolen, Wouter M.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [33] Monte-Carlo Tree Search for Scalable Coalition Formation
    Wu, Feng
    Ramchurn, Sarvapali D.
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 407 - 413
  • [34] Monte-Carlo Tree Search for the Maximum Satisfiability Problem
    Goffinet, Jack
    Ramanujan, Raghuram
    PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING, CP 2016, 2016, 9892 : 251 - 267
  • [35] Parallel Monte-Carlo Tree Search for HPC Systems
    Graf, Tobias
    Lorenz, Ulf
    Platzner, Marco
    Schaefers, Lars
    EURO-PAR 2011 PARALLEL PROCESSING, PT 2, 2011, 6853 : 365 - 376
  • [36] Monte-Carlo tree search as regularized policy optimization
    Grill, Jean-Bastien
    Altche, Florent
    Tang, Yunhao
    Hubert, Thomas
    Valko, Michal
    Antonoglou, Ioannis
    Munos, Remi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [37] Monte-Carlo tree search for Bayesian reinforcement learning
    Ngo Anh Vien
    Ertel, Wolfgang
    Viet-Hung Dang
    Chung, TaeChoong
    APPLIED INTELLIGENCE, 2013, 39 (02) : 345 - 353
  • [39] Using evaluation functions in Monte-Carlo Tree Search
    Lorentz, Richard
    THEORETICAL COMPUTER SCIENCE, 2016, 644 : 106 - 113
  • [40] Backpropagation Modification in Monte-Carlo Game Tree Search
    Xie, Fan
    Liu, Zhiqing
    2009 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL 2, PROCEEDINGS, 2009, : 125 - 128