Approximate policy iteration with a policy language bias: Solving relational Markov decision processes

Cited by: 0
Authors
Fern, Alan [1 ]
Yoon, Sungwook [2 ]
Givan, Robert [2 ]
Affiliations
[1] School of Electrical Engineering and Computer Science, Oregon State University, United States
[2] School of Electrical and Computer Engineering, Purdue University, United States
Abstract
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual value-function learning step with a learning step in policy space. This is advantageous in domains where good policies are easier to represent and learn than the corresponding value functions, which is often the case for the relational MDPs we are interested in. In order to apply API to such problems, we introduce a relational policy language and corresponding learner. In addition, we introduce a new bootstrapping routine for goal-based planning domains, based on random walks. Such bootstrapping is necessary for many large relational MDPs, where reward is extremely sparse, as API is ineffective in such domains when initialized with an uninformed policy. Our experiments show that the resulting system is able to find good policies for a number of classical planning domains and their stochastic variants by solving them as extremely large relational MDPs. The experiments also point to some limitations of our approach, suggesting future work. © 2006 AI Access Foundation. All rights reserved.
DOI
None available
Document type
Journal article (JA)
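The abstract's core idea, approximate policy iteration carried out in policy space rather than value-function space, can be illustrated with a minimal sketch. This is an illustrative toy example, not the paper's method: a deterministic chain MDP stands in for the large relational MDPs, a lookup table plays the role of the relational decision-list learner, and the random-walk bootstrapping routine is omitted. Each iteration labels states with the rollout-greedy action under the current policy, then "learns" a new policy from those labels.

```python
# Toy policy-space API: states 0..N on a chain, actions -1/+1,
# reward 1 whenever the agent is at the goal state N.
N = 6
GAMMA = 0.95

def step(s, a):
    s2 = max(0, min(N, s + a))
    return s2, (1.0 if s2 == N else 0.0)

def rollout_value(s, a, policy, horizon=20):
    # Estimate Q(s, a): take a once, then follow `policy`.
    # One rollout suffices because this toy MDP is deterministic.
    s2, r = step(s, a)
    total, disc = r, GAMMA
    for _ in range(horizon):
        s2, r = step(s2, policy(s2))
        total += disc * r
        disc *= GAMMA
    return total

def improve(policy):
    # Policy-space API step: label every state with the rollout-greedy
    # action, then fit a new policy to the labels. The lookup table here
    # stands in for the paper's relational policy-language learner.
    labels = {s: max((-1, 1), key=lambda a: rollout_value(s, a, policy))
              for s in range(N + 1)}
    return lambda s: labels[s]

# Start from a deliberately bad "always go left" policy; each improvement
# step propagates the goal-directed action one state further back.
pi = lambda s: -1
for _ in range(N):
    pi = improve(pi)

print([pi(s) for s in range(N + 1)])  # every state now chooses +1
```

The sparse-reward problem the abstract raises is visible even here: improvement only propagates backward from states whose rollouts reach reward, which is why the paper bootstraps goal-based domains with random walks instead of an uninformed initial policy.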
Pages: 75-118
Related papers
50 results in total
  • [1] Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
    Fern, A
    Yoon, S
    Givan, R
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2006, 25 : 75 - 118
  • [2] Evolutionary policy iteration for solving Markov decision processes
    Chang, HS
    Lee, HG
    Fu, MC
    Marcus, SI
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (11) : 1804 - 1808
  • [3] Approximate policy iteration with a policy language bias
    Fern, A
    Yoon, S
    Givan, R
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 847 - 854
  • [4] Approximate Policy Iteration for Markov Decision Processes via Quantitative Adaptive Aggregations
    Abate, Alessandro
    Ceska, Milan
    Kwiatkowska, Marta
    AUTOMATED TECHNOLOGY FOR VERIFICATION AND ANALYSIS, ATVA 2016, 2016, 9938 : 13 - 31
  • [5] Geometric Policy Iteration for Markov Decision Processes
    Wu, Yue
    De Loera, Jesus A.
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 2070 - 2078
  • [6] Policy set iteration for Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2013, 49 (12) : 3687 - 3689
  • [7] Cosine Policy Iteration for Solving Infinite-Horizon Markov Decision Processes
    Frausto-Solis, Juan
    Santiago, Elizabeth
    Mora-Vargas, Jaime
    MICAI 2009: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5845 : 75 - +
  • [8] Efficient Policy Iteration for Periodic Markov Decision Processes
    Osogami, Takayuki
    Raymond, Rudy
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1167 - 1172
  • [9] Policy iteration for robust nonstationary Markov decision processes
    Sinha, Saumya
    Ghate, Archis
    OPTIMIZATION LETTERS, 2016, 10 (08) : 1613 - 1628