Can Monte-Carlo Tree Search learn to sacrifice?

Authors
Nathan Companez
Aldeida Aleti
Affiliation
[1] Monash University, Faculty of Information Technology
Source
Journal of Heuristics | 2016 / Volume 22
Keywords
Monte-Carlo Tree Search; Sacrifice moves; Artificial intelligence; Games
DOI
Not available
Abstract
One of the most basic activities performed by an intelligent agent is deciding what to do next. The decision is usually about selecting the move with the highest expectation, or exploring new scenarios. Monte-Carlo Tree Search (MCTS), which was developed as a game playing agent, deals with this exploration–exploitation ‘dilemma’ using a multi-armed bandit strategy. The success of MCTS in a wide range of problems, such as combinatorial optimisation, reinforcement learning, and games, is due to its ability to rapidly evaluate problem states without requiring domain-specific knowledge. However, it has been acknowledged that the trade-off between exploration and exploitation is crucial for the performance of the algorithm, and affects the efficiency of the agent in learning deceptive states. One type of deceptive state gives an immediate reward but leads to a suboptimal solution in the long run. These states are known as trap states, and have been thoroughly investigated in previous research. In this work, we study the opposite of trap states, known as sacrifice states, which are deceptive moves that result in a local loss but are globally optimal, and investigate the efficiency of MCTS enhancements in identifying this type of move.
Pages: 783 - 813
Page count: 30
Related papers
  • [1] Can Monte-Carlo Tree Search learn to sacrifice?
    Companez, Nathan
    Aleti, Aldeida
    JOURNAL OF HEURISTICS, 2016, 22 (06) : 783 - 813
  • [2] Monte-Carlo Tree Search for Logistics
    Edelkamp, Stefan
    Gath, Max
    Greulich, Christoph
    Humann, Malte
    Herzog, Otthein
    Lawo, Michael
    COMMERCIAL TRANSPORT, 2016, : 427 - 440
  • [3] Monte-Carlo Tree Search Solver
    Winands, Mark H. M.
    Bjornsson, Yngvi
    Saito, Jahn-Takeshi
    COMPUTERS AND GAMES, 2008, 5131 : 25 - +
  • [4] Parallel Monte-Carlo Tree Search
    Chaslot, Guillaume M. J. -B.
    Winands, Mark H. M.
    van den Herik, H. Jaap
    COMPUTERS AND GAMES, 2008, 5131 : 60 - +
  • [5] Monte-Carlo Tree Search with Tree Shape Control
    Marchenko, Oleksandr I.
    Marchenko, Oleksii O.
    2017 IEEE FIRST UKRAINE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (UKRCON), 2017, : 812 - 817
  • [6] Monte-Carlo Tree Search: To MC or to DP?
    Feldman, Zohar
    Domshlak, Carmel
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 321 - 326
  • [7] Monte-Carlo Tree Search for Constrained POMDPs
    Lee, Jongmin
    Kim, Geon-Hyeong
    Poupart, Pascal
    Kim, Kee-Eung
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [8] Monte-Carlo Tree Search in Settlers of Catan
    Szita, Istvan
    Chaslot, Guillaume
    Spronck, Pieter
    ADVANCES IN COMPUTER GAMES, 2010, 6048 : 21 - +
  • [9] Monte-Carlo Tree Search for Policy Optimization
    Ma, Xiaobai
    Driggs-Campbell, Katherine
    Zhang, Zongzhang
    Kochenderfer, Mykel J.
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3116 - 3122
  • [10] Scalability and Parallelization of Monte-Carlo Tree Search
    Bourki, Amine
    Chaslot, Guillaume
    Coulm, Matthieu
    Danjean, Vincent
    Doghmen, Hassen
    Hoock, Jean-Baptiste
    Herault, Thomas
    Rimmel, Arpad
    Teytaud, Fabien
    Teytaud, Olivier
    Vayssiere, Paul
    Yu, Ziqin
    COMPUTERS AND GAMES, 2011, 6515 : 48 - 58