Anytime point-based approximations for large POMDPs

Cited by: 197
Authors
Pineau, Joelle [1]
Gordon, Geoffrey
Thrun, Sebastian
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada
[2] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15232 USA
[3] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
DOI
10.1613/jair.2078
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points that work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.
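The "value backup at a specific belief point" mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard textbook form of a point-based backup over a set of alpha-vectors and is not taken from the paper's own implementation. The function name `point_based_backup` and the nested-list layouts `T[a][s][s2]`, `O[a][s2][o]`, and `R[a][s]` are assumptions made for this illustration.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma):
    """Return the alpha-vector that is best at belief b after one backup.

    Assumed (hypothetical) layouts:
      b[s]          belief over states
      alphas        current list of alpha-vectors, each indexed by state
      T[a][s][s2]   P(s2 | s, a)
      O[a][s2][o]   P(o | s2, a)
      R[a][s]       immediate reward
    """
    n_states = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])  # start from the immediate reward for action a
        for o in range(len(O[a][0])):
            # Project every alpha-vector back through (a, o).
            projected = [
                [
                    gamma * sum(
                        T[a][s][s2] * O[a][s2][o] * alpha[s2]
                        for s2 in range(n_states)
                    )
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            best = max(projected, key=lambda g: dot(g, b))
            vec = [v + g for v, g in zip(vec, best)]
        val = dot(vec, b)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec
```

Applying this function to every point in a finite belief set yields one new alpha-vector per point, which is what keeps the cost of each backup stage bounded. How the belief set is chosen and expanded is the subject of the paper and is not shown here.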
Pages: 335-380
Number of pages: 46
Related papers
50 in total
  • [1] A stochastic point-based algorithm for POMDPs
    Laviolette, Francois
    Tobin, Ludovic
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, 2008, 5032 : 332 - 343
  • [2] Preprocessing for Point-Based Algorithms of POMDPs
    Bian, Ai-Hua
    Wang, Chong-Jun
    Chen, Shi-Fu
    [J]. 20TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL 1, PROCEEDINGS, 2008, : 519 - 522
  • [3] Point-based value iteration for continuous POMDPs
    Porta, Josep M.
    Vlassis, Nikos
    Spaan, Matthijs T. J.
    Poupart, Pascal
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2006, 7 : 2329 - 2367
  • [4] Point-Based Value Iteration for VAR-POMDPs
    Zheng, Wei
    Lin, Hai
    [J]. IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 7 - 12
  • [5] Point-Based Bounded Policy Iteration for Decentralized POMDPs
    Kim, Youngwook
    Kim, Kee-Eung
    [J]. PRICAI 2010: TRENDS IN ARTIFICIAL INTELLIGENCE, 2010, 6230 : 614 - +
  • [6] Belief selection in point-based planning algorithms for POMDPs
    Izadi, Masoumeh T.
    Precup, Doina
    Azar, Danielle
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4013 : 383 - 394
  • [7] Point-Based Planning for Multi-Objective POMDPs
    Roijers, Diederik M.
    Whiteson, Shimon
    Oliehoek, Frans A.
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 1666 - 1672
  • [8] Perseus: Randomized point-based value iteration for POMDPs
    Spaan, MTJ
    Vlassis, N
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 195 - 220
  • [9] Point-based Value Iteration for VAR-POMDPs
    Zheng, Wei
    Lin, Hai
    [J]. 2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 1143 - 1148
  • [10] Point-based Monte Carlo Online Planning in POMDPs
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    [J]. ADVANCES IN MECHATRONICS, AUTOMATION AND APPLIED INFORMATION TECHNOLOGIES, PTS 1 AND 2, 2014, 846-847 : 1388 - 1391