PUMA: Planning under Uncertainty with Macro-Actions

Cited by: 0
Authors
He, Ruijie [1]
Brunskill, Emma [2]
Roy, Nicholas [1]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Univ Calif Berkeley, Berkeley, CA 94709 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan a different potential action for each future observation can be prohibitively expensive when planning many steps ahead. An efficient solution for planning far into the future in fully observable domains is to use temporally extended sequences of actions, or "macro-actions." In this paper, we present a POMDP algorithm for planning under uncertainty with macro-actions (PUMA) that automatically constructs and evaluates open-loop macro-actions within forward-search planning, where the planner branches on observations only at the end of each macro-action. Additionally, we show how to incrementally refine the plan over time, resulting in an anytime algorithm that provably converges to an ε-optimal policy. In experiments on several large POMDP problems that require a long-horizon lookahead, PUMA outperforms existing state-of-the-art solvers.
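The cost argument in the abstract can be illustrated with a rough leaf count for a forward-search tree. The sketch below is not the PUMA algorithm; it only compares a tree that branches on every action and observation at each step with one that branches on observations only at the end of each macro-action. The function names and the parameter `num_macros` (the number of candidate macro-actions considered at each node) are hypothetical, since the abstract does not say how many macro-actions are constructed.

```python
def closed_loop_leaves(num_actions: int, num_observations: int, horizon: int) -> int:
    """Leaves of a tree that branches on every action and observation at each step."""
    return (num_actions * num_observations) ** horizon


def macro_action_leaves(
    num_macros: int, num_observations: int, horizon: int, macro_length: int
) -> int:
    """Leaves when observations are branched on only at the end of each macro-action.

    num_macros is an assumed number of candidate macro-actions per node.
    """
    if horizon % macro_length != 0:
        raise ValueError("horizon must be a multiple of macro_length in this sketch")
    depth = horizon // macro_length  # number of macro-action decisions
    return (num_macros * num_observations) ** depth


if __name__ == "__main__":
    # 4 primitive actions, 3 observations, 6-step lookahead
    print(closed_loop_leaves(4, 3, 6))      # 2985984
    # 4 candidate macro-actions of length 3 over the same 6 steps
    print(macro_action_leaves(4, 3, 6, 3))  # 144
```

Under these assumed numbers, the same 6-step lookahead needs two macro-action decisions instead of six primitive ones, which is where the reduction in branching comes from.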
Pages: 1089-1095
Number of pages: 7