Bounded Policy Iteration for Decentralized POMDPs

Cited by: 0
Authors
Bernstein, Daniel S. [1]
Hansen, Eric A.
Zilberstein, Shlomo [1]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information during execution, and show that this leads to improved performance. The algorithm uses a fixed amount of memory, and each iteration is guaranteed to produce a controller with value at least as high as the previous one for all possible initial state distributions. For the case of a single agent, the algorithm reduces to Poupart and Boutilier's bounded policy iteration for POMDPs.
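The controller representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not perform policy improvement; it only shows how local stochastic finite-state controllers might share a correlation device at execution time. All class names, the dictionary layout, the signal names `c0`/`c1`, and the constant placeholder observation `"none"` are assumptions made for this example. The sketch also assumes that each agent's action and node-transition distributions are conditioned on the device's current signal and that the device evolves independently of the agents' observations.

```python
import random


def sample(dist, rng):
    """Draw one key from a {outcome: probability} mapping."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


class CorrelationDevice:
    """Shared random signal (hypothetical layout); it ignores all observations."""

    def __init__(self, transition, start):
        self.transition = transition  # signal -> {next_signal: prob}
        self.signal = start

    def step(self, rng):
        self.signal = sample(self.transition[self.signal], rng)


class LocalController:
    """Stochastic finite-state controller for one agent (hypothetical layout)."""

    def __init__(self, action_dist, node_dist, start_node):
        self.action_dist = action_dist  # (node, signal) -> {action: prob}
        self.node_dist = node_dist      # (node, signal, action, obs) -> {next_node: prob}
        self.node = start_node

    def act(self, signal, rng):
        return sample(self.action_dist[(self.node, signal)], rng)

    def update(self, signal, action, obs, rng):
        key = (self.node, signal, action, obs)
        self.node = sample(self.node_dist[key], rng)


def run(controllers, device, steps, rng):
    """Execute the joint controller; agents never exchange messages."""
    history = []
    for _ in range(steps):
        signal = device.signal  # every agent reads the same signal
        actions = [c.act(signal, rng) for c in controllers]
        # A real Dec-POMDP would emit one private observation per agent here;
        # this toy example uses the constant placeholder "none".
        for c, a in zip(controllers, actions):
            c.update(signal, a, "none", rng)
        device.step(rng)
        history.append(tuple(actions))
    return history


def make_agent():
    """One-node controller whose action is determined by the shared signal."""
    action_dist = {("q0", "c0"): {"left": 1.0}, ("q0", "c1"): {"right": 1.0}}
    node_dist = {
        ("q0", s, a, "none"): {"q0": 1.0}
        for s in ("c0", "c1")
        for a in ("left", "right")
    }
    return LocalController(action_dist, node_dist, "q0")


uniform = {"c0": 0.5, "c1": 0.5}
device = CorrelationDevice({"c0": dict(uniform), "c1": dict(uniform)}, "c0")
history = run([make_agent(), make_agent()], device, 20, random.Random(0))
```

In this toy setup both agents always select the same action because each keys its choice on the shared signal, even though neither sends anything to the other. Two independent one-node controllers cannot produce that kind of coordinated randomization, which is the sort of behavior a correlation device is meant to enable.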
Pages: 1287-1292
Page count: 6
Related papers
50 in total
  • [1] Point-Based Bounded Policy Iteration for Decentralized POMDPs
    Kim, Youngwook
    Kim, Kee-Eung
    PRICAI 2010: TRENDS IN ARTIFICIAL INTELLIGENCE, 2010, 6230 : 614 - +
  • [2] Policy iteration for bounded-parameter POMDPs
    Yaodong Ni
    Zhi-Qiang Liu
    Soft Computing, 2013, 17 : 537 - 548
  • [3] Policy iteration for bounded-parameter POMDPs
    Ni, Yaodong
    Liu, Zhi-Qiang
    SOFT COMPUTING, 2013, 17 (04) : 537 - 548
  • [4] Privacy-Preserving Policy Iteration for Decentralized POMDPs
    Wu, Feng
    Zilberstein, Shlomo
    Chen, Xiaoping
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 4759 - 4766
  • [5] Scalable solutions of interactive POMDPs using generalized and bounded policy iteration
    Ekhlas Sonu
    Prashant Doshi
    Autonomous Agents and Multi-Agent Systems, 2015, 29 : 455 - 494
  • [6] Scalable solutions of interactive POMDPs using generalized and bounded policy iteration
    Sonu, Ekhlas
    Doshi, Prashant
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2015, 29 (03) : 455 - 494
  • [7] Policy Evaluation in Decentralized POMDPs With Belief Sharing
    Kayaalp, Mert
    Ghadieh, Fatima
    Sayed, Ali H.
    IEEE OPEN JOURNAL OF CONTROL SYSTEMS, 2023, 2 : 125 - 145
  • [8] Information Gathering in Decentralized POMDPs by Policy Graph Improvement
    Lauri, Mikko
    Pajarinen, Joni
    Peters, Jan
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1143 - 1151
  • [9] The Cross-Entropy Method for Policy Search in Decentralized POMDPs
    Oliehoek, Frans A.
    Kooij, Julian F. P.
    Vlassis, Nikos
    INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2008, 32 (04) : 341 - 357
  • [10] Sample-Based Policy Iteration for Constrained DEC-POMDPs
    Wu, Feng
    Jennings, Nicholas R.
    Chen, Xiaoping
    20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012), 2012, 242 : 858 - +