Episodic task learning in Markov decision processes

Cited by: 0
Authors
Yong Lin
Fillia Makedon
Yurong Xu
Affiliations
[1] Computer Science & Engineering
[2] Oracle Corporation
Keywords
Optimal Policy; Task State; Markov Decision Process; Belief State; Hierarchical Approach;
DOI: not available
Abstract
Hierarchical algorithms for Markov decision processes have proved useful in problem domains with multiple subtasks. Although existing hierarchical approaches are strong in task decomposition, they are weak in task abstraction, which is more important for task analysis and modeling. In this paper, we propose a task-oriented design that strengthens task abstraction. Our approach learns an episodic task model from the problem domain; with this model, the planner achieves the same control effect as the original model, but with a more concise structure and much better performance. According to our analysis and experimental evaluation, our approach outperforms existing hierarchical algorithms such as MAXQ and HEXQ.
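The paper builds on the standard MDP formalism in which a planner computes an optimal policy over states and actions. As background only, here is a minimal value-iteration sketch on a tiny episodic MDP; the states, transitions, and rewards below are invented for illustration and do not come from the paper.

```python
# Hypothetical episodic MDP: P[(state, action)] -> list of (prob, next_state, reward).
# "goal" is the terminal state that ends an episode.
P = {
    ("s0", "a"): [(1.0, "s1", 0.0)],
    ("s0", "b"): [(1.0, "s2", 0.0)],
    ("s1", "a"): [(1.0, "goal", 1.0)],
    ("s1", "b"): [(1.0, "s0", 0.0)],
    ("s2", "a"): [(1.0, "s0", 0.0)],
    ("s2", "b"): [(1.0, "goal", 5.0)],
}
states = ["s0", "s1", "s2", "goal"]
actions = ["a", "b"]
gamma = 0.9  # discount factor

def value_iteration(eps=1e-6):
    """Iterate the Bellman optimality update until values stop changing."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == "goal":  # terminal: value stays 0
                continue
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[(s, a)])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Extract the greedy policy from the converged values.
    policy = {
        s: max(actions, key=lambda a: sum(p * (r + gamma * V[s2])
                                          for p, s2, r in P[(s, a)]))
        for s in states if s != "goal"
    }
    return V, policy

V, pi = value_iteration()
print(pi)  # → {'s0': 'b', 's1': 'b', 's2': 'b'}
```

Here every state prefers the path toward the reward-5 exit, since the discounted detour through s2 still beats the immediate reward of 1. Hierarchical methods like MAXQ decompose such problems into subtasks; the paper's contribution is to abstract an episodic task model rather than only decompose.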
Pages: 87-98 (11 pages)
Related articles (50 total)
  • [41] Brazdil, T.; Chatterjee, K.; Chmelik, M.; Fellner, A.; Kretinsky, J. Counterexample Explanation by Learning Small Strategies in Markov Decision Processes. Computer Aided Verification, Pt. I, 2015, 9206: 158-177.
  • [42] Whitehead, S. D.; Lin, L. J. Reinforcement Learning of Non-Markov Decision Processes. Artificial Intelligence, 1995, 73(1-2): 271-306.
  • [43] Jonsson, A.; Barto, A. Active Learning of Dynamic Bayesian Networks in Markov Decision Processes. Abstraction, Reformulation, and Approximation, Proceedings, 2007, 4612: 273+.
  • [44] Miyazaki, K.; Kobayashi, S. Learning Deterministic Policies in Partially Observable Markov Decision Processes. Intelligent Autonomous Systems: IAS-5, 1998: 250-257.
  • [45] Jain, R.; Varaiya, P. P. PAC Learning for Markov Decision Processes and Dynamic Games. 2004 IEEE International Symposium on Information Theory, Proceedings, 2004: 468.
  • [46] Wu, B.; Zhang, X.; Lin, H. Permissive Supervisor Synthesis for Markov Decision Processes Through Learning. IEEE Transactions on Automatic Control, 2019, 64(8): 3332-3338.
  • [47] Mahadevan, S. Learning Representation and Control in Markov Decision Processes: New Frontiers. Foundations and Trends in Machine Learning, 2009, 1(4): 403-565.
  • [48] Dick, T.; Gyorgy, A.; Szepesvari, C. Online Learning in Markov Decision Processes with Changing Cost Sequences. International Conference on Machine Learning, Vol. 32 (Cycle 1), 2014.
  • [49] Wan, Y.; Naik, A.; Sutton, R. S. Learning and Planning in Average-Reward Markov Decision Processes. International Conference on Machine Learning, Vol. 139, 2021: 7665-7676.
  • [50] Cao, X.-R. From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning. Discrete Event Dynamic Systems, 2003, 13: 9-39.