Monte-Carlo Tree Search for Constrained POMDPs

Cited by: 0
Authors
Lee, Jongmin [1]
Kim, Geon-Hyeong [1]
Poupart, Pascal [2, 3]
Kim, Kee-Eung [1, 4]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea
[2] Univ Waterloo, Waterloo AI Inst, Waterloo, ON, Canada
[3] Vector Inst, Toronto, ON, Canada
[4] PROWLER Io, Cambridge, England
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.
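The abstract's reference to "optimal stochastic action selection" can be illustrated with a toy one-step example. The sketch below is not CC-POMCP and is not taken from the paper; the function name `best_mixture` and the reward/cost numbers are hypothetical, chosen only to show why a cost constraint can make a randomized choice between two actions better than either action played deterministically.

```python
from typing import List, Optional, Tuple


def best_mixture(
    actions: List[Tuple[float, float]],
    cost_limit: float,
    steps: int = 1000,
) -> Tuple[Optional[float], float]:
    """Grid-search the probability p of playing actions[0] (else actions[1]).

    Each action is a hypothetical (expected_reward, expected_cost) pair.
    Returns the feasible p with the highest expected reward, or
    (None, -inf) if no mixture satisfies the cost limit.
    """
    (r0, c0), (r1, c1) = actions
    best_p: Optional[float] = None
    best_reward = float("-inf")
    for i in range(steps + 1):
        p = i / steps
        reward = p * r0 + (1 - p) * r1
        cost = p * c0 + (1 - p) * c1
        # Keep only mixtures whose expected cost respects the constraint.
        if cost <= cost_limit + 1e-12 and reward > best_reward:
            best_p, best_reward = p, reward
    return best_p, best_reward


# Hypothetical numbers: a rewarding but costly action and a free, unrewarding one.
p, reward = best_mixture([(1.0, 1.0), (0.0, 0.0)], cost_limit=0.5)
print(p, reward)  # prints "0.5 0.5"
```

With these assumed numbers, always playing the costly action violates the limit and always playing the free action earns nothing, so the best feasible policy mixes the two with equal probability.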
Pages: 10
Related papers
50 in total
  • [21] Multilevel Monte-Carlo for Solving POMDPs Online
    Hoerger, Marcus
    Kurniawati, Hanna
    Elfes, Alberto
    ROBOTICS RESEARCH: THE 19TH INTERNATIONAL SYMPOSIUM ISRR, 2022, 20 : 174 - 190
  • [22] Automated Machine Learning with Monte-Carlo Tree Search
    Rakotoarison, Herilalaina
    Schoenauer, Marc
    Sebag, Michele
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3296 - 3303
  • [23] Generalized Mean Estimation in Monte-Carlo Tree Search
    Dam, Tuan
    Klink, Pascal
    D'Eramo, Carlo
    Peters, Jan
    Pajarinen, Joni
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2397 - 2404
  • [24] Monte-Carlo tree search as regularized policy optimization
    Grill, Jean-Bastien
    Altche, Florent
    Tang, Yunhao
    Hubert, Thomas
    Valko, Michal
    Antonoglou, Ioannis
    Munos, Remi
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [25] Converging to a Player Model In Monte-Carlo Tree Search
    Sarratt, Trevor
    Pynadath, David V.
    Jhala, Arnav
    2014 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND GAMES (CIG), 2014,
  • [26] AIs for Dominion Using Monte-Carlo Tree Search
    Tollisen, Robin
    Jansen, Jon Vegard
    Goodwin, Morten
    Glimsdal, Sondre
    CURRENT APPROACHES IN APPLIED ARTIFICIAL INTELLIGENCE, 2015, 9101 : 43 - 52
  • [27] Parallel Monte-Carlo Tree Search with Simulation Servers
    Kato, Hideki
    Takeuchi, Ikuo
    INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI 2010), 2010, : 491 - 498
  • [28] A SHOGI PROGRAM BASED ON MONTE-CARLO TREE SEARCH
    Sato, Yoshikuni
    Takahashi, Daisuke
    Grimbergen, Reijer
    ICGA JOURNAL, 2010, 33 (02) : 80 - 92
  • [29] CROSS-ENTROPY FOR MONTE-CARLO TREE SEARCH
    Chaslot, Guillaume M. J. B.
    Winands, Mark H. M.
    Szita, Istvan
    van den Herik, H. Jaap
    ICGA JOURNAL, 2008, 31 (03) : 145 - 156
  • [30] Monte-Carlo Tree Search Parallelisation for Computer Go
    van Niekerk, Francois
    Kroon, Steve
    van Rooyen, Gert-Jan
    Inggs, Cornelia P.
    PROCEEDINGS OF THE SOUTH AFRICAN INSTITUTE FOR COMPUTER SCIENTISTS AND INFORMATION TECHNOLOGISTS CONFERENCE, 2012, : 129 - 138