Monte-Carlo Tree Search for Constrained POMDPs

Cited by: 0
Authors
Lee, Jongmin [1]
Kim, Geon-Hyeong [1]
Poupart, Pascal [2, 3]
Kim, Kee-Eung [1, 4]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea
[2] Univ Waterloo, Waterloo AI Inst, Waterloo, ON, Canada
[3] Vector Inst, Toronto, ON, Canada
[4] PROWLER Io, Cambridge, England
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, where multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is such a model that maximizes the reward while constraining the cost, extending the standard POMDP model. To date, solution methods for CPOMDPs assume an explicit model of the environment, and thus are hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems.
Pages: 10