Approximate policy iteration with a policy language bias: Solving relational Markov decision processes

Cited by: 0
Authors
Fern, Alan [1 ]
Yoon, Sungwook [2 ]
Givan, Robert [2 ]
Affiliations
[1] School of Electrical Engineering and Computer Science, Oregon State University, United States
[2] School of Electrical and Computer Engineering, Purdue University, United States
Abstract
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual value-function learning step with a learning step in policy space. This is advantageous in domains where good policies are easier to represent and learn than the corresponding value functions, which is often the case for the relational MDPs we are interested in. In order to apply API to such problems, we introduce a relational policy language and corresponding learner. In addition, we introduce a new bootstrapping routine for goal-based planning domains, based on random walks. Such bootstrapping is necessary for many large relational MDPs, where reward is extremely sparse, as API is ineffective in such domains when initialized with an uninformed policy. Our experiments show that the resulting system is able to find good policies for a number of classical planning domains and their stochastic variants by solving them as extremely large relational MDPs. The experiments also point to some limitations of our approach, suggesting future work. © 2006 AI Access Foundation. All rights reserved.
DOI
None available
Document type
Journal article (JA)
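The abstract's core idea, approximate policy iteration carried out in policy space rather than value-function space, can be illustrated with a minimal sketch. This is an illustrative toy example, not the paper's method: a deterministic chain MDP stands in for the large relational MDPs, a lookup table plays the role of the relational decision-list learner, and the random-walk bootstrapping routine is omitted. Each iteration labels states with the rollout-greedy action under the current policy, then "learns" a new policy from those labels.

```python
# Toy policy-space API: states 0..N on a chain, actions -1/+1,
# reward 1 whenever the agent is at the goal state N.
N = 6
GAMMA = 0.95

def step(s, a):
    s2 = max(0, min(N, s + a))
    return s2, (1.0 if s2 == N else 0.0)

def rollout_value(s, a, policy, horizon=20):
    # Estimate Q(s, a): take a once, then follow `policy`.
    # One rollout suffices because this toy MDP is deterministic.
    s2, r = step(s, a)
    total, disc = r, GAMMA
    for _ in range(horizon):
        s2, r = step(s2, policy(s2))
        total += disc * r
        disc *= GAMMA
    return total

def improve(policy):
    # Policy-space API step: label every state with the rollout-greedy
    # action, then fit a new policy to the labels. The lookup table here
    # stands in for the paper's relational policy-language learner.
    labels = {s: max((-1, 1), key=lambda a: rollout_value(s, a, policy))
              for s in range(N + 1)}
    return lambda s: labels[s]

# Start from a deliberately bad "always go left" policy; each improvement
# step propagates the goal-directed action one state further back.
pi = lambda s: -1
for _ in range(N):
    pi = improve(pi)

print([pi(s) for s in range(N + 1)])  # every state now chooses +1
```

The sparse-reward problem the abstract raises is visible even here: improvement only propagates backward from states whose rollouts reach reward, which is why the paper bootstraps goal-based domains with random walks instead of an uninformed initial policy.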
Pages: 75-118
Related papers
50 results in total
  • [1] Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
    Fern, A
    Yoon, S
    Givan, R
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2006, 25 : 75 - 118
  • [2] Evolutionary policy iteration for solving Markov decision processes
    Chang, HS
    Lee, HG
    Fu, MC
    Marcus, SI
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (11) : 1804 - 1808
  • [3] Approximate policy iteration with a policy language bias
    Fern, A
    Yoon, S
    Givan, R
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 847 - 854
  • [4] Approximate Policy Iteration for Markov Decision Processes via Quantitative Adaptive Aggregations
    Abate, Alessandro
    Ceska, Milan
    Kwiatkowska, Marta
    AUTOMATED TECHNOLOGY FOR VERIFICATION AND ANALYSIS, ATVA 2016, 2016, 9938 : 13 - 31
  • [5] Geometric Policy Iteration for Markov Decision Processes
    Wu, Yue
    De Loera, Jesus A.
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 2070 - 2078
  • [6] Policy set iteration for Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2013, 49 (12) : 3687 - 3689
  • [7] Cosine Policy Iteration for Solving Infinite-Horizon Markov Decision Processes
    Frausto-Solis, Juan
    Santiago, Elizabeth
    Mora-Vargas, Jaime
    MICAI 2009: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5845 : 75 - +
  • [8] Efficient Policy Iteration for Periodic Markov Decision Processes
    Osogami, Takayuki
    Raymond, Rudy
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1167 - 1172
  • [9] Policy iteration for robust nonstationary Markov decision processes
    Sinha, Saumya
    Ghate, Archis
    OPTIMIZATION LETTERS, 2016, 10 (08) : 1613 - 1628