The Smoothed Complexity of Policy Iteration for Markov Decision Processes

Cited by: 0

Authors:

Christ, Miranda [1]

Yannakakis, Mihalis [1]

Affiliations:

[1] Columbia Univ, New York, NY 10027 USA

Source:

PROCEEDINGS OF THE 55TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, STOC 2023 | 2023

Keywords:

Policy Iteration; Smoothed Analysis; SIMPLEX; ALGORITHMS;

DOI:

10.1145/3564246.3585220

Chinese Library Classification:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

We show subexponential lower bounds (i.e., 2^{Ω(n^c)}) on the smoothed complexity of the classical Howard's Policy Iteration algorithm for Markov Decision Processes. The bounds hold for the total reward and the average reward criteria. The constructions are robust in the sense that the subexponential bound holds not only on average for independent random perturbations of the MDP parameters (transition probabilities and rewards), but for all arbitrary perturbations within an inverse polynomial range. We also show an exponential lower bound on the worst-case complexity for the simple reachability objective.
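For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of Howard's Policy Iteration, in which every state with an improving action is switched simultaneously in each round. It is an illustration only and not taken from the paper: it assumes the discounted reward criterion for simplicity (the abstract's bounds concern the total reward and average reward criteria), and the names `solve_linear`, `evaluate`, `howard_policy_iteration`, and the two-state example MDP are all hypothetical.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination over exact Fractions."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def evaluate(policy, P, R, gamma):
    """Value of a fixed policy: solve (I - gamma * P_pi) v = r_pi."""
    n = len(policy)
    a = [[(1 if i == j else 0) - gamma * P[i][policy[i]][j] for j in range(n)]
         for i in range(n)]
    b = [R[i][policy[i]] for i in range(n)]
    return solve_linear(a, b)


def howard_policy_iteration(P, R, gamma):
    """Assumed discounted setting. P[s][a][t] is a transition probability,
    R[s][a] a reward. Returns (policy, values, number of evaluations)."""
    n = len(P)
    policy = [0] * n
    iterations = 0
    while True:
        v = evaluate(policy, P, R, gamma)
        iterations += 1
        new_policy = policy[:]
        for s in range(n):
            def q(a):
                return R[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
            best = max(range(len(P[s])), key=q)
            # Howard's rule: switch every state that has a strictly better action.
            if q(best) > q(policy[s]):
                new_policy[s] = best
        if new_policy == policy:
            return policy, v, iterations
        policy = new_policy


# Hypothetical two-state MDP: action 0 leads to state 0, action 1 to state 1.
P = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
R = [[0, 1], [0, 2]]
policy, values, rounds = howard_policy_iteration(P, R, Fraction(1, 2))
```

Exact `Fraction` arithmetic is used so that the strict-improvement test is not affected by floating-point rounding. The lower bounds in the abstract concern how many such rounds the algorithm can need.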


Pages: 1890-1903

Page count: 14
