Hierarchical Monte-Carlo Planning

Cited by: 0

Authors:

Ngo Anh Vien [1]

Toussaint, Marc [1]

Affiliations:

[1] Univ Stuttgart, Machine Learning & Robot Lab, Stuttgart, Germany

Source:

PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2015

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Monte-Carlo Tree Search (MCTS) methods, especially UCT and its POMDP version POMCP, have demonstrated excellent performance on many problems. However, to scale efficiently to large domains, one should also exploit hierarchical structure if present. In such hierarchical domains, finding rewarded states typically requires searching deeply; covering enough of these informative states very far from the root becomes computationally expensive in flat, non-hierarchical search approaches. We propose novel, scalable MCTS methods that integrate a task hierarchy into the MCTS framework, specifically leading to hierarchical versions of both UCT and POMCP. The new method does not need to estimate probabilistic models of each subtask; instead, it computes subtask policies in a purely sample-based way. We evaluate the hierarchical MCTS methods in various settings, such as a hierarchical MDP, a Bayesian model-based hierarchical RL problem, and a large hierarchical POMDP.
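
For orientation, the flat UCT algorithm that the abstract builds on picks actions at each tree node with the UCB1 bandit rule. The following is a minimal sketch of that rule only, not the hierarchical method proposed in the paper. The function name `ucb1_select`, the `stats` layout, and the default exploration constant are assumptions made for illustration.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Pick an action by the UCB1 rule (hypothetical helper, for illustration).

    stats maps each action to a (visit_count, mean_return) pair.
    c is the exploration constant; sqrt(2) is an assumed default.
    """
    total_visits = sum(n for n, _ in stats.values())
    best_action, best_score = None, -math.inf
    for action, (n, mean) in stats.items():
        if n == 0:
            # Unvisited actions are tried first.
            return action
        # Mean return plus an exploration bonus that shrinks with visits.
        score = mean + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


if __name__ == "__main__":
    node_stats = {"left": (10, 0.5), "right": (0, 0.0)}
    print(ucb1_select(node_stats))
```

In a full planner this selection step would be applied at every node along a simulated trajectory; how a task hierarchy restricts or structures those choices is specific to the paper and is not reproduced here.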

Pages: 3613-3619

Page count: 7

