Bounded Policy Iteration for Decentralized POMDPs

Cited by: 0

Authors:

Bernstein, Daniel S. [1]

Hansen, Eric A.

Zilberstein, Shlomo [1]

Affiliations:

[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA

Source:

19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05) | 2005

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information during execution, and show that this leads to improved performance. The algorithm uses a fixed amount of memory, and each iteration is guaranteed to produce a controller with value at least as high as the previous one for all possible initial state distributions. For the case of a single agent, the algorithm reduces to Poupart and Boutilier's bounded policy iteration for POMDPs.


Pages: 1287-1292

Page count: 6
