Monte-Carlo Tree Search for Constrained POMDPs

Cited by: 0
Authors
Lee, Jongmin [1 ]
Kim, Geon-Hyeong [1 ]
Poupart, Pascal [2 ,3 ]
Kim, Kee-Eung [1 ,4 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea
[2] Univ Waterloo, Waterloo AI Inst, Waterloo, ON, Canada
[3] Vector Inst, Toronto, ON, Canada
[4] PROWLER Io, Cambridge, England
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Monte-Carlo Tree Search (MCTS) has been successfully applied to very large POMDPs, a standard model for stochastic sequential decision-making problems. However, many real-world problems inherently have multiple goals, for which multi-objective formulations are more natural. The constrained POMDP (CPOMDP) is one such model: it extends the standard POMDP by maximizing the reward while constraining the cost. To date, solution methods for CPOMDPs have assumed an explicit model of the environment and are therefore hardly applicable to large-scale real-world problems. In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and requires only a black-box simulator of the environment. In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDPs and pushes the state of the art by scaling to very large problems.
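The abstract describes the core idea behind CC-POMCP: run a POMCP-style search against a black-box simulator while scalarizing reward and cost with an LP-induced dual parameter (a Lagrange-multiplier-like weight lambda) that is adjusted online so the estimated cost respects the budget. The sketch below only illustrates that general idea under assumptions of our own; the class and method names (BlackBoxSimulator, CCPOMCPSketch, step, search), the specific lambda update rule, and all hyperparameters are hypothetical and are not taken from the paper, which in addition selects a stochastic action at the root rather than the greedy choice shown here.

```python
# Illustrative sketch of a cost-constrained POMCP-style search.
# NOT the authors' CC-POMCP implementation: names, the lambda update,
# and the greedy root selection are simplifying assumptions.
import math
import random


class BlackBoxSimulator:
    """Generative model: step(state, action) -> (next_state, obs, reward, cost, done).
    Observations are assumed to be hashable."""
    def __init__(self, actions):
        self.actions = actions

    def step(self, state, action):
        raise NotImplementedError


class Node:
    def __init__(self):
        self.visits = 0
        self.q_reward = 0.0   # running mean of the discounted reward return
        self.q_cost = 0.0     # running mean of the discounted cost return
        self.children = {}    # keyed by action (belief node) or observation (action node)


class CCPOMCPSketch:
    def __init__(self, sim, cost_budget, gamma=0.95, c_ucb=1.0,
                 lam_init=1.0, lam_lr=0.05, lam_max=100.0, max_depth=20):
        self.sim, self.c_hat, self.gamma = sim, cost_budget, gamma
        self.c_ucb, self.lam, self.lam_lr = c_ucb, lam_init, lam_lr
        self.lam_max, self.max_depth = lam_max, max_depth

    def search(self, belief_particles, n_sims=1000):
        root = Node()
        for _ in range(n_sims):
            state = random.choice(belief_particles)  # sample a state from the belief
            self._simulate(state, root, depth=0)
            # Dual-style update: raise lambda when estimated cost exceeds the budget,
            # lower it otherwise, projected onto [0, lam_max].
            best = self._greedy_action(root)
            cost_gap = root.children[best].q_cost - self.c_hat
            self.lam = min(self.lam_max, max(0.0, self.lam + self.lam_lr * cost_gap))
        return self._greedy_action(root)

    def _greedy_action(self, node):
        # Greedy w.r.t. the scalarized value Q_R - lambda * Q_C.
        return max(node.children,
                   key=lambda a: node.children[a].q_reward - self.lam * node.children[a].q_cost)

    def _ucb_action(self, node):
        def ucb(a):
            child = node.children[a]
            if child.visits == 0:
                return float("inf")
            bonus = self.c_ucb * math.sqrt(math.log(node.visits + 1) / child.visits)
            return child.q_reward - self.lam * child.q_cost + bonus
        return max(node.children, key=ucb)

    def _simulate(self, state, node, depth):
        if depth >= self.max_depth:
            return 0.0, 0.0
        if not node.children:  # expand a leaf with one child per action
            node.children = {a: Node() for a in self.sim.actions}
        action = self._ucb_action(node)
        next_state, obs, reward, cost, done = self.sim.step(state, action)
        child = node.children[action]
        if done:
            r_ret, c_ret = reward, cost
        else:
            obs_node = child.children.setdefault(obs, Node())
            r_next, c_next = self._simulate(next_state, obs_node, depth + 1)
            r_ret = reward + self.gamma * r_next
            c_ret = cost + self.gamma * c_next
        # Incremental mean updates for both return estimates.
        node.visits += 1
        child.visits += 1
        child.q_reward += (r_ret - child.q_reward) / child.visits
        child.q_cost += (c_ret - child.q_cost) / child.visits
        return r_ret, c_ret
```

To try the sketch, one would subclass BlackBoxSimulator with a domain-specific step function and call search with a list of particles representing the current belief; a rollout policy for leaf evaluation, used by POMCP proper, is omitted here for brevity.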
Pages: 10
Related Papers
50 records in total
  • [41] Monte-Carlo tree search for Bayesian reinforcement learning
    Ngo Anh Vien
    Ertel, Wolfgang
    Viet-Hung Dang
    Chung, TaeChoong
    APPLIED INTELLIGENCE, 2013, 39 (02) : 345 - 353
  • [43] Using evaluation functions in Monte-Carlo Tree Search
    Lorentz, Richard
    THEORETICAL COMPUTER SCIENCE, 2016, 644 : 106 - 113
  • [44] Backpropagation Modification in Monte-Carlo Game Tree Search
    Xie, Fan
    Liu, Zhiqing
    2009 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL 2, PROCEEDINGS, 2009, : 125 - 128
  • [45] Monte-Carlo Tree Search in Dragline Operation Planning
    Liu, Haoquan
    Austin, Kevin
    Forbes, Michael
    Kearney, Michael
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2018, 3 (01) : 419 - 425
  • [46] Single-Player Monte-Carlo Tree Search
    Schadd, Maarten P. D.
    Winands, Mark H. M.
    van den Herik, H. Jaap
    Chaslot, Guillaume M. J. -B.
    Uiterwijk, Jos W. H. M.
    COMPUTERS AND GAMES, 2008, 5131 : 1 - +
  • [47] Combining Monte-Carlo Tree Search with Proof-Number Search
    Doe, Elliot
    Winands, Mark H. M.
    Soemers, Dennis J. N. J.
    Browne, Cameron
    2022 IEEE CONFERENCE ON GAMES, COG, 2022, : 206 - 212
  • [48] Online Planning for Interactive-POMDPs using Nested Monte Carlo Tree Search
    Schwartz, Jonathon
    Zhou, Ruijia
    Kurniawati, Hanna
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 8770 - 8777
  • [49] Thompson Sampling Based Monte-Carlo Planning in POMDPs
    Bai, Aijun
    Wu, Feng
    Zhang, Zongzhang
    Chen, Xiaoping
    TWENTY-FOURTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING, 2014, : 29 - 37
  • [50] Efficiency of Static Knowledge Bias in Monte-Carlo Tree Search
    Ikeda, Kokolo
    Viennot, Simon
    COMPUTERS AND GAMES, CG 2013, 2014, 8427 : 26 - 38