Multi-agent active information gathering in discrete and continuous-state decentralized POMDPs by policy graph improvement

Cited by: 10
Authors
Lauri, Mikko [1 ]
Pajarinen, Joni [2 ,3 ]
Peters, Jan [3 ,4 ]
Affiliations
[1] Univ Hamburg, Dept Informat, Hamburg, Germany
[2] Tampere Univ, Tampere, Finland
[3] Tech Univ Darmstadt, Intelligent Autonomous Syst, Darmstadt, Germany
[4] Max Planck Inst, Tübingen, Germany
Funding
European Research Council;
Keywords
Planning under uncertainty; Decentralized POMDP; Information gathering; Active perception; OPTIMIZATION; EXPLORATION;
DOI
10.1007/s10458-020-09467-6
Chinese Library Classification
TP [automation technology, computer technology];
Discipline classification code
0812
Abstract
Decentralized policies for information gathering are required when multiple autonomous agents are deployed to collect data about a phenomenon of interest and constant communication cannot be assumed. This is common in tasks involving multiple independently operating sensor devices that may be separated by large physical distances, such as unmanned aerial vehicles, or in communication-limited environments, as in the case of autonomous underwater vehicles. In this paper, we frame the information gathering task as a general decentralized partially observable Markov decision process (Dec-POMDP). The Dec-POMDP is a principled model for cooperative decentralized multi-agent decision-making. An optimal solution of a Dec-POMDP is a set of local policies, one for each agent, that maximizes the expected sum of rewards over time. In contrast to most prior work on Dec-POMDPs, we set the reward to be a non-linear function of the agents' state information, for example the negative Shannon entropy. We argue that such reward functions are well suited to decentralized information gathering problems. We prove that if the reward function is convex, then the finite-horizon value function of the Dec-POMDP is also convex. We propose the first heuristic anytime algorithm for information gathering Dec-POMDPs, and empirically demonstrate its effectiveness by solving discrete problems an order of magnitude larger than the previous state of the art. We also propose an extension to continuous-state problems with finite action and observation spaces by employing particle filtering. The effectiveness of the proposed algorithms is verified in domains such as decentralized target tracking, scientific survey planning, and signal source localization.
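The abstract's key modeling choice is a reward that is a convex function of the agents' state information, such as the negative Shannon entropy of a belief, and a particle-filtering extension for continuous states. A minimal sketch of that idea (function names are hypothetical illustrations, not the authors' code) computes the negative-entropy reward on a discrete belief vector, a histogram plug-in estimate of the same quantity from a particle set, and spot-checks the convexity property the paper proves:

```python
import numpy as np

def neg_entropy_reward(belief):
    """Negative Shannon entropy of a discrete belief vector.

    This is convex in the belief and rewards concentrated
    (low-uncertainty) beliefs, as described in the abstract.
    """
    b = np.asarray(belief, dtype=float)
    nz = b[b > 0.0]  # 0 * log(0) is taken as 0
    return float(np.sum(nz * np.log(nz)))

def particle_neg_entropy(particles, bins=10):
    """Crude plug-in estimate of the same reward from a particle set,
    via a normalized histogram over the state space (illustrative only)."""
    hist, _ = np.histogram(particles, bins=bins)
    return neg_entropy_reward(hist / hist.sum())

# Convexity spot-check: for convex f, f(mixture) <= mixture of f values.
b1, b2 = np.array([0.9, 0.1]), np.array([0.2, 0.8])
mix = 0.5 * b1 + 0.5 * b2
assert neg_entropy_reward(mix) <= 0.5 * (neg_entropy_reward(b1) + neg_entropy_reward(b2))
```

A certain belief such as `[1.0, 0.0]` attains the maximum reward 0, while the uniform belief attains the minimum, so maximizing this reward drives agents toward informative observations.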
Pages: 44
Related papers
14 records in total
  • [1] Multi-agent active information gathering in discrete and continuous-state decentralized POMDPs by policy graph improvement
    Lauri, Mikko
    Pajarinen, Joni
    Peters, Jan
    [J]. Autonomous Agents and Multi-Agent Systems, 2020, 34
  • [2] Information Gathering in Decentralized POMDPs by Policy Graph Improvement
    Lauri, Mikko
    Pajarinen, Joni
    Peters, Jan
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1143 - 1151
  • [3] Policy Graph Pruning and Optimization in Monte Carlo Value Iteration for Continuous-State POMDPs
    Qian, Weisheng
    Liu, Quan
    Zhang, Zongzhang
    Pan, Zhiyuan
    Zhong, Shan
    [J]. PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [4] Decentralized Coordination of Multi-Agent Systems Based on POMDPs and Consensus for Active Perception
    Peti, Marijana
    Petric, Frano
    Bogdan, Stjepan
    [J]. IEEE ACCESS, 2023, 11 : 52480 - 52491
  • [5] Continuous Foraging and Information Gathering in a Multi-Agent Team
    Liemhetcharat, Somchaya
    Yan, Rui
    Tee, Keng Peng
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 1325 - 1333
  • [6] Safe Policy Synthesis in Multi-Agent POMDPs via Discrete-Time Barrier Functions
    Ahmadi, Mohamadreza
    Singletary, Andrew
    Burdick, Joel W.
    Ames, Aaron D.
    [J]. 2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 4797 - 4803
  • [7] Value Functions Factorization With Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients
    Zhou, Hanhan
    Lan, Tian
    Aggarwal, Vaneet
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (05): : 1351 - 1361
  • [8] Multi-Agent system-based decentralized state estimation method for active distribution networks
    Adjerid, Hamza
    Maouche, Amin Riad
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2020, 86
  • [9] Learning with policy prediction in continuous state-action multi-agent decision processes
    Ghorbani, Farzaneh
    Afsharchi, Mohsen
    Derhami, Vali
    [J]. SOFT COMPUTING, 2020, 24 (02) : 901 - 918