Anytime point-based approximations for large POMDPs

Cited by: 197

Authors:

Pineau, Joelle ^{[1]}

Gordon, Geoffrey

Thrun, Sebastian

Affiliations:

[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada

[2] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15232 USA

[3] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA

Source:

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH | 2006 / Vol. 27

Keywords:

DOI:

10.1613/jair.2078

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points that work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.
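The abstract mentions value backups performed at specific belief points. As a rough illustration only, the following is a minimal sketch of a generic textbook point-based backup at a single belief; it is not taken from the paper. The function name `point_based_backup` and the nested-list layouts `T[a][s][s2]` (transition probabilities), `O[a][s2][o]` (observation probabilities), and `R[s][a]` (rewards) are assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, Gamma, T, O, R, gamma):
    """Back up the value function at one belief point b.

    Assumed (hypothetical) layouts:
      b      : belief, list of length |S|
      Gamma  : current alpha-vectors, list of lists of length |S|
      T[a][s][s2] : P(s2 | s, a)
      O[a][s2][o] : P(o | s2, a)
      R[s][a]     : immediate reward
    Returns the single alpha-vector that is best at b after one backup.
    """
    n_states = len(b)
    n_actions = len(T)
    n_obs = len(O[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(n_actions):
        # Start from the immediate reward for action a.
        alpha_a = [float(R[s][a]) for s in range(n_states)]
        for o in range(n_obs):
            # Project every alpha-vector back through (a, o).
            projected = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in Gamma
            ]
            # Keep only the projection that is best at this belief.
            g_best = max(projected, key=lambda g: dot(g, b))
            alpha_a = [x + gamma * y for x, y in zip(alpha_a, g_best)]
        val = dot(alpha_a, b)
        if val > best_val:
            best_vec, best_val = alpha_a, val
    return best_vec
```

Because the maximization is taken only at the given belief, the backup produces one vector per belief point instead of enumerating every combination over the whole simplex, which is the source of the speed-up the abstract refers to.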


Pages: 335 - 380

Page count: 46

50 items in total

[1] A stochastic point-based algorithm for POMDPs
Laviolette, Francois
Tobin, Ludovic
[J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, 2008, 5032 : 332 - 343
[2] Preprocessing for Point-Based Algorithms of POMDPs
Bian, Ai-Hua
Wang, Chong-Jun
Chen, Shi-Fu
[J]. 20TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL 1, PROCEEDINGS, 2008, : 519 - 522
[3] Point-based value iteration for continuous POMDPs
Porta, Josep M.
Vlassis, Nikos
Spaan, Matthijs T. J.
Poupart, Pascal
[J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2006, 7 : 2329 - 2367
[4] Point-Based Value Iteration for VAR-POMDPs
Zheng, Wei
Lin, Hai
[J]. IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 7 - 12
[5] Point-Based Bounded Policy Iteration for Decentralized POMDPs
Kim, Youngwook
Kim, Kee-Eung
[J]. PRICAI 2010: TRENDS IN ARTIFICIAL INTELLIGENCE, 2010, 6230 : 614 - +
[6] Belief selection in point-based planning algorithms for POMDPs
Izadi, Masoumeh T.
Precup, Doina
Azar, Danielle
[J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4013 : 383 - 394
[7] Point-Based Planning for Multi-Objective POMDPs
Roijers, Diederik M.
Whiteson, Shimon
Oliehoek, Frans A.
[J]. PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 1666 - 1672
[8] Perseus: Randomized point-based value iteration for POMDPs
Spaan, MTJ
Vlassis, N
[J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 195 - 220
[9] Point-based Value Iteration for VAR-POMDPs
Zheng, Wei
Lin, Hai
[J]. 2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 1143 - 1148
[10] Point-based Monte Carlo Online Planning in POMDPs
Wu, Bo
Feng, Yanpeng
Zheng, Hongyan
[J]. ADVANCES IN MECHATRONICS, AUTOMATION AND APPLIED INFORMATION TECHNOLOGIES, PTS 1 AND 2, 2014, 846-847 : 1388 - 1391
