Optimal Policies for Quantum Markov Decision Processes

Cited by: 0
Authors
Ming-Sheng Ying
Yuan Feng
Sheng-Gang Ying
Affiliations
[1] University of Technology, Centre for Quantum Software and Information
[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software
[3] Tsinghua University, Department of Computer Science and Technology
Keywords
Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making
DOI
Not available
Abstract
The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
Pages: 410-421
Number of pages: 11
Related papers
50 in total
  • [21] Algorithm to identify and compute average optimal policies in multichain Markov decision processes
    Leizarowitz, A
    MATHEMATICS OF OPERATIONS RESEARCH, 2003, 28 (03) : 553 - 586
  • [22] Computing optimal stationary policies for multi-objective Markov decision processes
    Wiering, Marco A.
    de Jong, Edwin D.
    2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 158 - +
  • [23] Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
    Montes-De-Oca, Raul
    Lemus-Rodriguez, Enrique
    RECENT ADVANCES IN APPLIED MATHEMATICS, 2009, : 213 - +
  • [24] Markov decision processes based optimal control policies for probabilistic boolean networks
    Abul, O
    Alhajj, R
    Polat, F
    BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, : 337 - 344
  • [25] Least Inferable Policies for Markov Decision Processes
    Karabag, Mustafa O.
    Ornik, Melkior
    Topcu, Ufuk
    2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 1224 - 1231
  • [26] Ranking policies in discrete Markov decision processes
    Peng Dai
    Judy Goldsmith
    Annals of Mathematics and Artificial Intelligence, 2010, 59 : 107 - 123
  • [27] Ranking policies in discrete Markov decision processes
    Dai, Peng
    Goldsmith, Judy
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2010, 59 (01) : 107 - 123
  • [28] Robustness of policies in constrained Markov decision processes
    Zadorojniy, A
    Shwartz, A
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) : 635 - 638
  • [29] AVERAGE OPTIMAL POLICIES IN MARKOV DECISION DRIFT PROCESSES WITH APPLICATIONS TO A QUEUING AND A REPLACEMENT MODEL
    HORDIJK, A
    SCHOUTEN, FAV
    ADVANCES IN APPLIED PROBABILITY, 1983, 15 (02) : 274 - 303
  • [30] Approximation of average cost optimal policies for general Markov decision processes with unbounded costs
    Gordienko, E
    Montes-de-Oca, R
    Minjarez-Sosa, A
    MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 1997, 45 (02) : 245 - 263