Optimal Policies for Quantum Markov Decision Processes

Cited by: 0

Authors:

Ming-Sheng Ying

Yuan Feng

Sheng-Gang Ying

Affiliations:

[1] University of Technology, Centre for Quantum Software and Information

[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software

[3] Tsinghua University, Department of Computer Science and Technology

Source:

International Journal of Automation and Computing | 2021 / Vol. 18

Keywords:

Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making

DOI:

Not available

CLC number:

Subject classification number:

Abstract:

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide mathematical tools for applying reinforcement learning techniques to the quantum world.
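
For orientation only, the following is a minimal sketch of finite-horizon dynamic programming (backward induction) for a classical MDP, the setting that the abstract says is being extended. It is not the paper's qMDP algorithm, which this record does not reproduce. The function name `backward_induction`, the nested-list layout of `P` and `R`, and the two-state example are all assumptions made for illustration.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon backward induction for a classical MDP (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t under action a.
    R[a][s] is the expected immediate reward for taking action a in state s.
    Returns (V, policy): V[s] is the optimal expected total reward from s over
    `horizon` steps, and policy[k][s] is an optimal action at step k in state s.
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states  # assumed terminal value: nothing is earned after the last step
    policy = []
    for _ in range(horizon):
        # Q[s][a]: reward now plus expected value of the remaining steps
        Q = [
            [
                R[a][s] + sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        # Greedy action per state (ties go to the lowest action index)
        step_policy = [
            max(range(n_actions), key=Q[s].__getitem__) for s in range(n_states)
        ]
        V = [Q[s][step_policy[s]] for s in range(n_states)]
        policy.insert(0, step_policy)  # computed last-to-first, stored first-to-last
    return V, policy


# Hypothetical example: action 0 stays put, action 1 swaps the two states;
# only staying in state 1 pays a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: swap
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
]
V, policy = backward_induction(P, R, horizon=2)
print(V, policy)
```

In this example the optimal plan from state 0 is to swap first and then stay, which is why the first-step policy differs from the last-step policy; time-dependent policies of this kind are typical of the finite-horizon case.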

Pages: 410-421

Number of pages: 11

50 items in total

[21] Algorithm to identify and compute average optimal policies in multichain Markov decision processes
Leizarowitz, A
MATHEMATICS OF OPERATIONS RESEARCH, 2003, 28 (03) : 553 - 586
[22] Computing optimal stationary policies for multi-objective Markov decision processes
Wiering, Marco A.
de Jong, Edwin D.
2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 158 - +
[23] Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
Montes-De-Oca, Raul
Lemus-Rodriguez, Enrique
RECENT ADVANCES IN APPLIED MATHEMATICS, 2009, : 213 - +
[24] Markov decision processes based optimal control policies for probabilistic boolean networks
Abul, O
Alhajj, R
Polat, F
BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, : 337 - 344
[25] Least Inferable Policies for Markov Decision Processes
Karabag, Mustafa O.
Ornik, Melkior
Topcu, Ufuk
2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 1224 - 1231
[26] Ranking policies in discrete Markov decision processes
Peng Dai
Judy Goldsmith
Annals of Mathematics and Artificial Intelligence, 2010, 59 : 107 - 123
[27] Ranking policies in discrete Markov decision processes
Dai, Peng
Goldsmith, Judy
ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2010, 59 (01) : 107 - 123
[28] Robustness of policies in constrained Markov decision processes
Zadorojniy, A
Shwartz, A
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) : 635 - 638
[29] AVERAGE OPTIMAL POLICIES IN MARKOV DECISION DRIFT PROCESSES WITH APPLICATIONS TO A QUEUING AND A REPLACEMENT MODEL
HORDIJK, A
SCHOUTEN, FAV
ADVANCES IN APPLIED PROBABILITY, 1983, 15 (02) : 274 - 303
[30] Approximation of average cost optimal policies for general Markov decision processes with unbounded costs
Gordienko, E
Montes-de-Oca, R
Minjarez-Sosa, A
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 1997, 45 (02) : 245 - 263