Learning and Planning with Timing Information in Markov Decision Processes

Cited by: 0
Authors
Bacon, Pierre-Luc [1 ]
Balle, Borja [1 ]
Precup, Doina [1 ]
Affiliations
[1] McGill Univ, Sch Comp Sci, Reasoning & Learning Lab, Montreal, PQ, Canada
Keywords
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
We consider the problem of learning and planning in Markov decision processes with temporally extended actions represented in the options framework. We propose to use predictions about the duration of extended actions to represent the state, and show that this leads to a compact predictive state representation model that is independent of the set of primitive actions. We then develop a consistent and efficient spectral learning algorithm for such models. Using only timing information to represent the state allows faster improvement in planning performance. We illustrate our approach with experiments in both synthetic and robot navigation domains.
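The spectral learning step mentioned in the abstract can be sketched with a generic Hankel-matrix routine. This is a minimal illustration of spectral learning for a predictive state representation over strings, not the authors' implementation: the function names, the encoding of durations as repeated symbols, and the choice of prefix/suffix sets are all assumptions made for this sketch.

```python
# Minimal sketch of spectral PSR learning: build an empirical Hankel
# matrix from observed strings, truncate its SVD, and recover linear
# operators. Illustrative only; not the paper's code.
from collections import Counter
import numpy as np

def hankel_blocks(strings, prefixes, suffixes, alphabet):
    """Empirical Hankel matrix H[p, s] = P(ps), plus one shifted
    block H_a[p, s] = P(pas) per symbol a."""
    counts, n = Counter(strings), len(strings)
    prob = lambda w: counts[w] / n
    H = np.array([[prob(p + s) for s in suffixes] for p in prefixes])
    Ha = {a: np.array([[prob(p + a + s) for s in suffixes]
                       for p in prefixes]) for a in alphabet}
    h_p = np.array([prob(p) for p in prefixes])  # empty-suffix column
    h_s = np.array([prob(s) for s in suffixes])  # empty-prefix row
    return H, Ha, h_p, h_s

def spectral_psr(strings, prefixes, suffixes, alphabet, rank):
    """Recover (b0, {B_a}, binf) from a truncated SVD of H, so that
    P(w) is approximated by b0 . B_w1 ... B_wn . binf."""
    H, Ha, h_p, h_s = hankel_blocks(strings, prefixes, suffixes, alphabet)
    U, S, Vt = np.linalg.svd(H, full_matrices=False)
    U, S, Vt = U[:, :rank], S[:rank], Vt[:rank, :]
    Pinv = np.linalg.pinv(U * S)       # pseudo-inverse of U @ diag(S)
    B = {a: Pinv @ Ha[a] @ Vt.T for a in alphabet}
    b0 = Vt @ h_s                      # initial weight vector
    binf = Pinv @ h_p                  # termination weight vector
    return b0, B, binf

def string_prob(w, b0, B, binf):
    """Model probability of string w (row-vector convention)."""
    v = b0
    for a in w:
        v = v @ B[a]
    return float(v @ binf)
```

For example, if durations are drawn from a geometric distribution and encoded as `'a' * k`, the underlying Hankel matrix has rank 1, and the learned model recovers the string probabilities from samples.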
Pages: 111-120 (10 pages)
Related Papers (50 total)
  • [1] Learning and Planning in Average-Reward Markov Decision Processes
    Wan, Yi
    Naik, Abhishek
    Sutton, Richard S.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7665 - 7676
  • [2] Preference Planning for Markov Decision Processes
    Li, Meilun
    She, Zhikun
    Turrini, Andrea
    Zhang, Lijun
    [J]. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3313 - 3319
  • [3] Planning with Abstract Markov Decision Processes
    Gopalan, Nakul
    desJardins, Marie
    Littman, Michael L.
    MacGlashan, James
    Squire, Shawn
    Tellex, Stefanie
    Winder, John
    Wong, Lawson L. S.
    [J]. TWENTY-SEVENTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING, 2017, : 480 - 488
  • [4] A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
    Ross, Stephane
    Pineau, Joelle
    Chaib-draa, Brahim
    Kreitmann, Pierre
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1729 - 1770
  • [5] Policy Reuse for Learning and Planning in Partially Observable Markov Decision Processes
    Wu, Bo
    Feng, Yanpeng
    [J]. 2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2017, : 549 - 552
  • [6] Learning to Collaborate in Markov Decision Processes
    Radanovic, Goran
    Devidze, Rati
    Parkes, David C.
    Singla, Adish
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [7] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01): : 441 - 453
  • [8] Multiagent, Multitarget Path Planning in Markov Decision Processes
    Nawaz, Farhad
    Ornik, Melkior
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (12) : 7560 - 7574
  • [9] Approximate planning and verification for large Markov decision processes
    Lassaigne, Richard
    Peyronnet, Sylvain
    [J]. INTERNATIONAL JOURNAL ON SOFTWARE TOOLS FOR TECHNOLOGY TRANSFER, 2015, 17 (04) : 457 - 467
  • [10] Planning using hierarchical constrained Markov decision processes
    Feyzabadi, Seyedshams
    Carpin, Stefano
    [J]. AUTONOMOUS ROBOTS, 2017, 41 : 1589 - 1607