Monte-Carlo Tree Search for Constrained POMDPs

Cited by: 0

Authors:

Lee, Jongmin [1]

Kim, Geon-Hyeong [1]

Poupart, Pascal [2,3]

Kim, Kee-Eung [1,4]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea

[2] Univ Waterloo, Waterloo AI Inst, Waterloo, ON, Canada

[3] Vector Inst, Toronto, ON, Canada

[4] PROWLER Io, Cambridge, England

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31

Keywords:

MARKOV DECISION-PROCESSES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.

Pages: 10

