Hierarchical Monte-Carlo Planning

Cited by: 0
Authors
Ngo Anh Vien [1]
Toussaint, Marc [1]
Affiliations
[1] Univ Stuttgart, Machine Learning & Robot Lab, Stuttgart, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monte-Carlo Tree Search methods, especially UCT and its POMDP version POMCP, have demonstrated excellent performance on many problems. However, to scale efficiently to large domains, one should also exploit hierarchical structure where it is present. In such hierarchical domains, finding rewarded states typically requires searching deeply; covering enough of these informative states far from the root becomes computationally expensive for flat, non-hierarchical search approaches. We propose novel, scalable MCTS methods that integrate a task hierarchy into the MCTS framework, leading specifically to hierarchical versions of both UCT and POMCP. The new method does not need to estimate probabilistic models of each subtask; instead, it computes subtask policies in a purely sample-based manner. We evaluate the hierarchical MCTS methods in various settings, such as a hierarchical MDP, a Bayesian model-based hierarchical RL problem, and a large hierarchical POMDP.
Pages: 3613-3619
Number of pages: 7