Bounded Policy Iteration for Decentralized POMDPs

Citations: 0
Authors
Bernstein, Daniel S. [1]
Hansen, Eric A.
Zilberstein, Shlomo [1]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information during execution, and show that this leads to improved performance. The algorithm uses a fixed amount of memory, and each iteration is guaranteed to produce a controller with value at least as high as the previous one for all possible initial state distributions. For the case of a single agent, the algorithm reduces to Poupart and Boutilier's bounded policy iteration for POMDPs.
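The controller representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it covers only execution of a joint controller, not the policy iteration step. All class and function names (`LocalController`, `CorrelationDevice`, `joint_step`, `build_example`) are hypothetical. The sketch assumes that the correlation device is a small Markov chain whose signal every agent can read, that its transitions do not depend on the environment or on the agents, and that each local controller conditions its action choice and node transition on that signal.

```python
import random


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]


class CorrelationDevice:
    """Shared random signal (assumed form: a Markov chain over signals)."""

    def __init__(self, transition, start=0):
        self.transition = transition  # transition[signal] -> {next_signal: prob}
        self.signal = start

    def step(self, rng):
        self.signal = sample(self.transition[self.signal], rng)


class LocalController:
    """Stochastic finite-state controller for a single agent."""

    def __init__(self, action_dist, node_dist, start_node=0):
        self.action_dist = action_dist  # (node, signal) -> {action: prob}
        self.node_dist = node_dist      # (node, signal, action, obs) -> {node: prob}
        self.node = start_node

    def act(self, signal, rng):
        return sample(self.action_dist[(self.node, signal)], rng)

    def update(self, signal, action, observation, rng):
        key = (self.node, signal, action, observation)
        self.node = sample(self.node_dist[key], rng)


def joint_step(device, controllers, observe, rng):
    """Run one step. `observe` maps a joint action to one observation per agent."""
    signal = device.signal
    actions = tuple(c.act(signal, rng) for c in controllers)
    observations = observe(actions)
    for c, a, o in zip(controllers, actions, observations):
        c.update(signal, a, o, rng)  # each agent uses only its own observation
    device.step(rng)
    return actions


def build_example():
    """Two one-node agents that play A on signal 0 and B on signal 1."""
    uniform = {0: 0.5, 1: 0.5}
    device = CorrelationDevice({0: uniform, 1: uniform})
    action_dist = {(0, 0): {"A": 1.0}, (0, 1): {"B": 1.0}}
    node_dist = {(0, s, a, "o"): {0: 1.0} for s in (0, 1) for a in ("A", "B")}
    controllers = [LocalController(action_dist, node_dist) for _ in range(2)]
    return device, controllers
```

In the toy instance built by `build_example`, the two agents randomize between A and B yet always choose the same action, even though neither sees the other's action or observation. This is the kind of correlation without communication that the abstract attributes to the correlation device.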
Pages: 1287-1292
Page count: 6
Related papers
50 in total
  • [11] Bounded Policy Synthesis for POMDPs with Safe-Reachability Objectives
    Wang, Yue
    Chaudhuri, Swarat
    Kavraki, Lydia E.
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), 2018, : 238 - 246
  • [12] Open Decentralized POMDPs
    Cohen, Jonathan
    Dibangoye, Jilles Steeve
    Mouaddib, Abdel-Illah
    2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017), 2017, : 977 - 984
  • [13] Forward Search Value Iteration For POMDPs
    Shani, Guy
    Brafman, Ronen I.
    Shimony, Solomon E.
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 2619 - 2624
  • [14] Policy Iteration for Decentralized Control of Markov Decision Processes
    Bernstein, Daniel S.
    Amato, Christopher
    Hansen, Eric A.
    Zilberstein, Shlomo
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2009, 34 : 89 - 132
  • [15] Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs
    Amato, Christopher
    Bernstein, Daniel S.
    Zilberstein, Shlomo
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2010, 21 (03) : 293 - 320
  • [16] Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs
    Christopher Amato
    Daniel S. Bernstein
    Shlomo Zilberstein
    Autonomous Agents and Multi-Agent Systems, 2010, 21 : 293 - 320
  • [17] Policy Graph Pruning and Optimization in Monte Carlo Value Iteration for Continuous-State POMDPs
    Qian, Weisheng
    Liu, Quan
    Zhang, Zongzhang
    Pan, Zhiyuan
    Zhong, Shan
    PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [18] Point-based value iteration for continuous POMDPs
    Institut de Robòtica i Informàtica Industrial, UPC-CSIC, Llorens i Artigas 4-6, 08028, Barcelona, Spain
    Unknown
    Unknown
    J. Mach. Learn. Res., 2006, : 2329 - 2367
  • [19] Planning with Macro-Actions in Decentralized POMDPs
    Amato, Christopher
    Konidaris, George D.
    Kaelbling, Leslie P.
    AAMAS'14: PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2014, : 1273 - 1280
  • [20] Point-based value iteration for continuous POMDPs
    Porta, Josep M.
    Vlassis, Nikos
    Spaan, Matthijs T. J.
    Poupart, Pascal
    JOURNAL OF MACHINE LEARNING RESEARCH, 2006, 7 : 2329 - 2367