Episodic task learning in Markov decision processes

Cited by: 0
Authors
Yong Lin
Fillia Makedon
Yurong Xu
Institutions
[1] Computer Science & Engineering
[2] Oracle Corporation
Keywords
Optimal Policy; Task State; Markov Decision Process; Belief State; Hierarchical Approach
DOI
Not available
Abstract
Hierarchical algorithms for Markov decision processes have proved useful in problem domains with multiple subtasks. Although existing hierarchical approaches are strong in task decomposition, they are weak in task abstraction, which is more important for task analysis and modeling. In this paper, we propose a task-oriented design that strengthens task abstraction. Our approach learns an episodic task model from the problem domain, with which the planner achieves the same control effect as the original model, but with a more concise structure and much better performance. According to our analysis and experimental evaluation, our approach outperforms existing hierarchical algorithms such as MAXQ and HEXQ.
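The abstract assumes familiarity with computing an optimal policy for a Markov decision process. As background, here is a minimal value-iteration sketch on a toy episodic MDP; the states, actions, transitions, and rewards are invented for illustration and are not the paper's model or method.

```python
# Minimal value-iteration sketch for a toy episodic MDP (illustrative only).
# Transition model: P[(state, action)] = list of (probability, next_state, reward).
P = {
    ("s0", "a"): [(1.0, "s1", 0.0)],
    ("s0", "b"): [(1.0, "s2", 1.0)],
    ("s1", "a"): [(1.0, "goal", 10.0)],
    ("s1", "b"): [(1.0, "s0", 0.0)],
    ("s2", "a"): [(1.0, "goal", 2.0)],
    ("s2", "b"): [(1.0, "s0", 0.0)],
}
STATES = ["s0", "s1", "s2", "goal"]  # "goal" is terminal: the episode ends there
GAMMA = 0.9  # discount factor

def value_iteration(tol=1e-8):
    """Iterate the Bellman optimality update until values stop changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == "goal":
                continue  # terminal state keeps value 0
            best = max(
                sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[(s, a)])
                for a in ("a", "b")
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

def greedy_policy(V):
    """Extract the optimal policy by acting greedily on the converged values."""
    pi = {}
    for s in STATES:
        if s == "goal":
            continue
        pi[s] = max(
            ("a", "b"),
            key=lambda a: sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[(s, a)]),
        )
    return pi

V = value_iteration()
pi = greedy_policy(V)
```

On this toy model the greedy policy routes s0 through s1 to the high-reward terminal transition. Hierarchical methods such as MAXQ and HEXQ, and the episodic task model proposed here, aim to reach such policies with a more compact task structure than flat value iteration over the full state space.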
Pages: 87-98
Number of pages: 11
Related papers
50 items in total
  • [1] Episodic task learning in Markov decision processes
    Lin, Yong
    Makedon, Fillia
    Xu, Yurong
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2011, 36 (02) : 87 - 98
  • [2] Online Learning with Implicit Exploration in Episodic Markov Decision Processes
    Ghasemi, Mahsa
    Hashemi, Abolfazl
    Vikalo, Haris
    Topcu, Ufuk
    [J]. 2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 1953 - 1958
  • [3] Differentially Private Regret Minimization in Episodic Markov Decision Processes
    Chowdhury, Sayak Ray
    Zhou, Xingyu
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6375 - 6383
  • [4] Learning to Collaborate in Markov Decision Processes
    Radanovic, Goran
    Devidze, Rati
    Parkes, David C.
    Singla, Adish
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [5] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01): : 441 - 453
  • [6] Pure Exploration in Episodic Fixed-Horizon Markov Decision Processes
    Putta, Sudeep Raja
    Tulabandhula, Theja
    [J]. AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 1703 - 1704
  • [7] Blackwell Online Learning for Markov Decision Processes
    Li, Tao
    Peng, Guanze
    Zhu, Quanyan
    [J]. 2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021,
  • [8] Online Learning in Kernelized Markov Decision Processes
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    [J]. 22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [9] Learning Factored Markov Decision Processes with Unawareness
    Innes, Craig
    Lascarides, Alex
    [J]. 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 123 - 133
  • [10] Bayesian Learning of Noisy Markov Decision Processes
    Singh, Sumeetpal S.
    Chopin, Nicolas
    Whiteley, Nick
    [J]. ACM TRANSACTIONS ON MODELING AND COMPUTER SIMULATION, 2013, 23 (01):