Model-based reinforcement learning in factored-state MDPs

Cited by: 10

Authors:

Strehl, Alexander L. [1]

Affiliations:

[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA

Source:

2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING | 2007

Keywords:

DOI:

10.1109/ADPRL.2007.368176

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We consider the problem of learning in a factored state Markov Decision Process that is structured to allow a compact representation. We show that the well-known algorithm, factored Rmax, performs near-optimally on all but a number of timesteps that is polynomial in the size of the compact representation, which is often exponentially smaller than the number of states. This is equivalent to the result obtained by Kearns and Koller for their DBN-E-3 algorithm, except that we've conducted the analysis in a more general setting. We also extend the results to a new algorithm, factored IE, that uses the Interval Estimation approach to exploration and can be expected to outperform factored Rmax on most domains.
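As a rough illustration of why a compact representation can be exponentially smaller than the state space, the sketch below counts entries under assumptions that are not taken from the paper: binary state variables, a DBN-style transition model in which each variable depends on at most a fixed number of binary parent variables, and a small action set. The function names and the example numbers are hypothetical.

```python
def num_states(n_vars: int) -> int:
    """Number of joint states for n_vars binary state variables."""
    return 2 ** n_vars


def factored_model_entries(n_vars: int, max_parents: int, n_actions: int) -> int:
    """Upper bound on conditional-probability entries in a factored model.

    Assumption (illustrative only): every binary variable has at most
    `max_parents` binary parents, so each (variable, action) pair needs at
    most 2 ** max_parents parent configurations, one free probability each.
    """
    return n_vars * n_actions * 2 ** max_parents


if __name__ == "__main__":
    n, k, a = 20, 3, 4  # hypothetical problem size
    print(num_states(n))                    # 1048576
    print(factored_model_entries(n, k, a))  # 640
```

Under these assumed numbers, a bound that is polynomial in the 640 model entries is far smaller than one that is polynomial in the roughly one million flat states.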


Pages: 103-110

Number of pages: 8

50 records in total

[1] Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs
Kroon, Mark
Whiteson, Shimon
[J]. EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 324 - 330
[2] Efficient reinforcement learning in factored MDPs
Kearns, M
Koller, D
[J]. IJCAI-99: PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 & 2, 1999, : 740 - 747
[3] A Model-based Factored Bayesian Reinforcement Learning Approach
Wu, Bo
Feng, Yanpeng
Zheng, Hongyan
[J]. APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
[4] TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs
Kozlova, Olga
Sigaud, Olivier
Meyer, Christophe
[J]. FROM ANIMALS TO ANIMATS 11, 2010, 6226 : 489 - +
[5] Multi-Task Approach to Reinforcement Learning for Factored-State Markov Decision Problems
Simm, Jaak
Sugiyama, Masashi
Hachiya, Hirotaka
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (10) : 2426 - 2437
[6] Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions
Deng, Zihao
Devic, Siddartha
Juba, Brendan
[J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
[7] Near-optimal Reinforcement Learning in Factored MDPs
Osband, Ian
Van Roy, Benjamin
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
[8] Exploiting Additive Structure in Factored MDPs for Reinforcement Learning
Degris, Thomas
Sigaud, Olivier
Wuillemin, Pierre-Henri
[J]. RECENT ADVANCES IN REINFORCEMENT LEARNING, 2008, 5323 : 15 - 26
[9] Model-based Bayesian Reinforcement Learning in Factored Markov Decision Process
Wu, Bo
Feng, Yanpeng
Zheng, Hongyan
[J]. JOURNAL OF COMPUTERS, 2014, 9 (04) : 845 - 850
[10] An Associative State-Space Metric for Learning in Factored MDPs
Sequeira, Pedro
Melo, Francisco S.
Paiva, Ana
[J]. PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2013, 2013, 8154 : 163 - 174
