Forward Search Value Iteration For POMDPs

Cited by: 0
Authors
Shani, Guy [1]
Brafman, Ronen I. [1]
Shimony, Solomon E. [1]
Affiliation
[1] Ben Gurion Univ Negev, Dept Comp Sci, Beer Sheva, Israel
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent scaling up of POMDP solvers towards realistic applications is largely due to point-based methods, which quickly converge to an approximate solution for medium-sized problems. Of this family, HSVI, which uses trial-based asynchronous value iteration, can handle the largest domains. In this paper we suggest a new algorithm, FSVI, that uses the underlying MDP to traverse the belief space towards rewards, finding sequences of useful backups, and show how it scales up better than HSVI on larger benchmarks.
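The abstract gives only a high-level description of FSVI, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes a hypothetical three-state corridor POMDP (the functions `T`, `O`, `R` and all constants are invented for illustration), solves the underlying MDP by value iteration, follows the MDP-greedy action from a sampled state while tracking the belief, and applies point-based backups in reverse order along the visited beliefs.

```python
import random

# Hypothetical 3-state corridor; state 2 is an absorbing goal (assumption).
S, A, Z = range(3), range(2), range(2)   # actions: 0 = left, 1 = right
GAMMA = 0.95

def T(s, a):
    """Assumed transition model: distribution over next states."""
    if s == 2:
        return [0.0, 0.0, 1.0]
    target = min(s + 1, 2) if a == 1 else max(s - 1, 0)
    dist = [0.0, 0.0, 0.0]
    dist[target] += 0.9   # intended move
    dist[s] += 0.1        # slip: stay in place
    return dist

def O(a, s2):
    """Assumed observation model: observation 1 hints at the goal."""
    return [0.2, 0.8] if s2 == 2 else [0.8, 0.2]

def R(s, a):
    """Assumed reward: 1 for moving right from the state next to the goal."""
    return 1.0 if s == 1 and a == 1 else 0.0

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def solve_mdp(iters=200):
    """Value iteration on the underlying (fully observable) MDP; returns Q."""
    V = [0.0] * len(S)
    for _ in range(iters):
        Q = [[R(s, a) + GAMMA * dot(T(s, a), V) for a in A] for s in S]
        V = [max(q) for q in Q]
    return Q

def belief_update(b, a, o):
    """Bayes filter: new belief after taking action a and observing o."""
    nb = [O(a, s2)[o] * sum(b[s] * T(s, a)[s2] for s in S) for s2 in S]
    total = sum(nb)
    return [x / total for x in nb]

def backup(b, V):
    """Point-based backup at belief b over the alpha-vector set V."""
    best = None
    for a in A:
        g_a = [R(s, a) for s in S]
        for o in Z:
            cands = [[sum(T(s, a)[s2] * O(a, s2)[o] * alpha[s2] for s2 in S)
                      for s in S] for alpha in V]
            g_ao = max(cands, key=lambda g: dot(b, g))
            g_a = [x + GAMMA * y for x, y in zip(g_a, g_ao)]
        if best is None or dot(b, g_a) > dot(b, best):
            best = g_a
    return best

def fsvi_trial(b, s, Q, V, depth=0, max_depth=20):
    """Follow the MDP-greedy action from the sampled state, then back up."""
    if s != 2 and depth < max_depth:
        a = max(A, key=lambda act: Q[s][act])
        s2 = random.choices(list(S), weights=T(s, a))[0]
        o = random.choices(list(Z), weights=O(a, s2))[0]
        fsvi_trial(belief_update(b, a, o), s2, Q, V, depth + 1, max_depth)
    V.append(backup(b, V))   # backups run in reverse order of the traversal

def fsvi(b0, trials=20, seed=0):
    random.seed(seed)
    Q = solve_mdp()
    V = [[0.0] * len(S)]     # valid lower bound because rewards are >= 0
    for _ in range(trials):
        s0 = random.choices(list(S), weights=b0)[0]
        fsvi_trial(b0, s0, Q, V)
    return V

def value(b, V):
    return max(dot(b, alpha) for alpha in V)

if __name__ == "__main__":
    b0 = [0.5, 0.5, 0.0]
    print(value(b0, fsvi(b0)))
```

In this sketch the trial never consults the belief when choosing an action; only the sampled underlying state and the MDP Q-function steer the traversal, which is what keeps each step cheap. The belief is used solely as the point at which the backup is computed.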
Pages: 2619-2624
Page count: 6
Related papers
50 in total
  • [31] Monte-Carlo Tree Search for Constrained POMDPs
    Lee, Jongmin
    Kim, Geon-Hyeong
    Poupart, Pascal
    Kim, Kee-Eung
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [32] Adaptive Online Packing-guided Search for POMDPs
    Wu, Chenyang
    Yang, Guoyu
    Zhang, Zongzhang
    Yu, Yang
    Li, Dong
    Liu, Wulong
    Hao, Jianye
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [33] Monte-Carlo Search for an Equilibrium in Dec-POMDPs
    You, Yang
    Thomas, Vincent
    Colas, Francis
    Buffet, Olivier
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2444 - 2453
  • [34] Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games
    Horak, Karel
    Bosansky, Branislav
    Pechoucek, Michal
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 558 - 564
  • [35] Value Iteration Networks
    Tamar, Aviv
    Wu, Yi
    Thomas, Garrett
    Levine, Sergey
    Abbeel, Pieter
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [36] Simulated Annealing Monte Carlo Tree Search for large POMDPs
    Xiong, Kai
    Jiang, Hong
    2014 SIXTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL 1, 2014, : 140 - 143
  • [37] Value Iteration Networks
    Tamar, Aviv
    Wu, Yi
    Thomas, Garrett
    Levine, Sergey
    Abbeel, Pieter
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4949 - 4953
  • [38] Robust asset allocation with conditional value at risk using the forward search
    Grossi, Luigi
    Laurini, Fabrizio
    APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2020, 36 (03) : 335 - 352
  • [39] Optimistic Value Iteration
    Hartmanns, Arnd
    Kaminski, Benjamin Lucien
    COMPUTER AIDED VERIFICATION, PT II, 2020, 12225 : 488 - 511
  • [40] Bisection Value Iteration
    Lu, Jia
    Xu, Ming
    2022 29TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC, 2022, : 109 - 118