Episodic task learning in Markov decision processes

Cited by: 0

Authors:

Yong Lin

Fillia Makedon

Yurong Xu

Affiliations:

[1] Computer Science & Engineering,

[2] Oracle Corporation

Source:

Artificial Intelligence Review | 2011 / Vol. 36

Keywords:

Optimal Policy; Task State; Markov Decision Process; Belief State; Hierarchical Approach;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Hierarchical algorithms for Markov decision processes have proved useful for problem domains with multiple subtasks. Although the existing hierarchical approaches are strong in task decomposition, they are weak in task abstraction, which is more important for task analysis and modeling. In this paper, we propose a task-oriented design to strengthen task abstraction. Our approach learns an episodic task model from the problem domain, with which the planner obtains the same control effect as with the original model, but with a more concise structure and much improved performance. According to our analysis and experimental evaluation, our approach performs better than existing hierarchical algorithms such as MAXQ and HEXQ.

Pages: 87-98

Page count: 11

50 items in total

[41] Counterexample Explanation by Learning Small Strategies in Markov Decision Processes
Brazdil, Tomas
Chatterjee, Krishnendu
Chmelik, Martin
Fellner, Andreas
Kretinsky, Jan
[J]. COMPUTER AIDED VERIFICATION, PT I, 2015, 9206 : 158 - 177
[42] REINFORCEMENT LEARNING OF NON-MARKOV DECISION-PROCESSES
WHITEHEAD, SD
LIN, LJ
[J]. ARTIFICIAL INTELLIGENCE, 1995, 73 (1-2) : 271 - 306
[43] Active learning of dynamic Bayesian networks in Markov decision processes
Jonsson, Anders
Barto, Andrew
[J]. ABSTRACTION, REFORMULATION, AND APPROXIMATION, PROCEEDINGS, 2007, 4612 : 273 - +
[44] Learning deterministic policies in partially observable Markov decision processes
Miyazaki, K
Kobayashi, S
[J]. INTELLIGENT AUTONOMOUS SYSTEMS: IAS-5, 1998, : 250 - 257
[45] PAC learning for Markov decision processes and dynamic games
Jain, R
Varaiya, PP
[J]. 2004 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, PROCEEDINGS, 2004, : 468 - 468
[46] Permissive Supervisor Synthesis for Markov Decision Processes Through Learning
Wu, Bo
Zhang, Xiaobin
Lin, Hai
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (08) : 3332 - 3338
[47] Learning Representation and Control in Markov Decision Processes: New Frontiers
Mahadevan, Sridhar
[J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2009, 1 (04) : 403 - 565
[48] Online Learning in Markov Decision Processes with Changing Cost Sequences
Dick, Travis
Gyorgy, Andras
Szepesvari, Csaba
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 1), 2014, 32
[49] Learning and Planning in Average-Reward Markov Decision Processes
Wan, Yi
Naik, Abhishek
Sutton, Richard S.
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7665 - 7676
[50] From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
Xi-Ren Cao
[J]. Discrete Event Dynamic Systems, 2003, 13 : 9 - 39