Model-based reinforcement learning in factored-state MDPs

Cited by: 10
Authors
Strehl, Alexander L. [1]
Affiliation
[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
DOI
10.1109/ADPRL.2007.368176
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We consider the problem of learning in a factored-state Markov Decision Process that is structured to allow a compact representation. We show that the well-known algorithm, factored Rmax, performs near-optimally on all but a number of timesteps that is polynomial in the size of the compact representation, which is often exponentially smaller than the number of states. This is equivalent to the result obtained by Kearns and Koller for their DBN-E^3 algorithm, except that we have conducted the analysis in a more general setting. We also extend the results to a new algorithm, factored IE, that uses the Interval Estimation approach to exploration and can be expected to outperform factored Rmax on most domains.
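To make the phrase "exponentially smaller than the number of states" concrete, the following minimal sketch compares the number of flat states with the number of entries in a factored transition model. The counting convention is an assumption made for illustration and is not taken from the paper: every state variable takes the same number of values, each variable depends on at most a fixed number of parent variables, and there is one conditional table per variable and action. The function names are hypothetical.

```python
def flat_state_count(num_vars: int, values_per_var: int) -> int:
    """Number of states when every joint assignment is a separate state."""
    return values_per_var ** num_vars


def factored_table_entries(
    num_vars: int, values_per_var: int, max_parents: int, num_actions: int
) -> int:
    """Upper bound on transition-table entries in the assumed factored model.

    Assumption (illustrative): one conditional table per (variable, action),
    with one row per setting of at most `max_parents` parent variables and
    one column per value of the variable itself.
    """
    rows_per_table = values_per_var ** max_parents
    return num_vars * num_actions * rows_per_table * values_per_var


if __name__ == "__main__":
    n, d, k, a = 20, 2, 3, 4  # hypothetical problem size
    print("flat states:     ", flat_state_count(n, d))
    print("factored entries:", factored_table_entries(n, d, k, a))
```

Under these assumed sizes (20 binary variables, at most 3 parents each, 4 actions), the flat representation has 1,048,576 states while the factored tables have 1,280 entries. The first quantity grows exponentially in the number of variables; the second grows linearly in it for a fixed parent bound.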
Pages: 103-110
Page count: 8
Related papers
  • [1] Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs
    Kroon, Mark
    Whiteson, Shimon
    [J]. EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 324 - 330
  • [2] Efficient reinforcement learning in factored MDPs
    Kearns, M
    Koller, D
    [J]. IJCAI-99: PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 & 2, 1999, : 740 - 747
  • [3] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    [J]. APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [4] TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs
    Kozlova, Olga
    Sigaud, Olivier
    Meyer, Christophe
    [J]. FROM ANIMALS TO ANIMATS 11, 2010, 6226 : 489 - +
  • [5] Multi-Task Approach to Reinforcement Learning for Factored-State Markov Decision Problems
    Simm, Jaak
    Sugiyama, Masashi
    Hachiya, Hirotaka
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (10) : 2426 - 2437
  • [6] Polynomial Time Reinforcement Learning in Factored State MDPs with Linear Value Functions
    Deng, Zihao
    Devic, Siddartha
    Juba, Brendan
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [7] Near-optimal Reinforcement Learning in Factored MDPs
    Osband, Ian
    Van Roy, Benjamin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [8] Exploiting Additive Structure in Factored MDPs for Reinforcement Learning
    Degris, Thomas
    Sigaud, Olivier
    Wuillemin, Pierre-Henri
    [J]. RECENT ADVANCES IN REINFORCEMENT LEARNING, 2008, 5323 : 15 - 26
  • [9] Model-based Bayesian Reinforcement Learning in Factored Markov Decision Process
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    [J]. JOURNAL OF COMPUTERS, 2014, 9 (04) : 845 - 850
  • [10] An Associative State-Space Metric for Learning in Factored MDPs
    Sequeira, Pedro
    Melo, Francisco S.
    Paiva, Ana
    [J]. PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2013, 2013, 8154 : 163 - 174