Modeling and Planning with Macro-Actions in Decentralized POMDPs

Cited by: 27
Authors
Amato, Christopher [1]
Konidaris, George [2]
Kaelbling, Leslie P. [3]
How, Jonathan P. [4]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
[2] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
[3] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] MIT, Lab Informat & Decis Syst, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Funding
U.S. National Science Foundation
Keywords
FRAMEWORK; MOTION;
DOI
10.1613/jair.1.11418
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized multi-agent decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent's actions are primitive operations lasting exactly one time step. We address the case where each agent has macro-actions: temporally extended actions that may require different amounts of time to execute. We model macro-actions as options in a Dec-POMDP, focusing on actions that depend only on information directly available to the agent during execution. Therefore, we model systems where coordination decisions only occur at the level of deciding which macro-actions to execute. The core technical difficulty in this setting is that the options chosen by each agent no longer terminate at the same time. We extend three leading Dec-POMDP algorithms for policy generation to the macro-action case, and demonstrate their effectiveness in both standard benchmarks and a multi-robot coordination problem. The results show that our new algorithms retain agent coordination while allowing high-quality solutions to be generated for significantly longer horizons and larger state-spaces than previous Dec-POMDP methods. Furthermore, in the multi-robot domain, we show that, in contrast to most existing methods that are specialized to a particular problem class, our approach can synthesize control policies that exploit opportunities for coordination while balancing uncertainty, sensor information, and information about other agents.
Pages: 817-859
Page count: 43
Related papers
50 in total
  • [1] Planning with Macro-Actions in Decentralized POMDPs
    Amato, Christopher
    Konidaris, George D.
    Kaelbling, Leslie P.
    [J]. AAMAS'14: PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2014: 1273 - 1280
  • [2] Approximate planning in POMDPs with macro-actions
    Theocharous, G
    Kaelbling, LP
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 775 - 782
  • [3] Mining useful Macro-actions in Planning
    Castellanos-Paez, Sandra
    Pellier, Damien
    Fiorino, Humbert
    Pesty, Sylvie
    [J]. 2016 THIRD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION (AIPR), 2016
  • [4] PUMA: Planning under Uncertainty with Macro-Actions
    He, Ruijie
    Brunskill, Emma
    Roy, Nicholas
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010: 1089 - 1095
  • [5] Efficient Planning under Uncertainty with Macro-actions
    He, Ruijie
    Brunskill, Emma
    Roy, Nicholas
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2011, 40 : 523 - 570
  • [6] Enhancing Temporal Planning by Sequential Macro-Actions
    De Bortoli, Marco
    Chrpa, Lukas
    Gebser, Martin
    Steinbauer-Wagner, Gerald
    [J]. LOGICS IN ARTIFICIAL INTELLIGENCE, JELIA 2023, 2023, 14281 : 595 - 604
  • [7] Implicit Learning of Compiled Macro-Actions for Planning
    Newton, M. A. Hakim
    Levine, John
    [J]. ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 323 - 328
  • [8] MAGIC: Learning Macro-Actions for Online POMDP Planning
    Lee, Yiyuan
    Cai, Panpan
    Hsu, David
    [J]. ROBOTICS: SCIENCE AND SYSTEM XVII, 2021
  • [9] Exploiting macro-actions and predicting plan length in planning as satisfiability
    Gerevini, Alfonso Emilio
    Saetti, Alessandro
    Vallati, Mauro
    [J]. AI COMMUNICATIONS, 2015, 28 (02) : 323 - 344
  • [10] Exploiting Macro-actions and Predicting Plan Length in Planning as Satisfiability
    Gerevini, Alfonso Emilio
    Saetti, Alessandro
    Vallati, Mauro
    [J]. AI(STAR)IA 2011: ARTIFICIAL INTELLIGENCE AROUND MAN AND BEYOND, 2011, 6934 : 189 - 200