Optimal Policies for Quantum Markov Decision Processes

Cited by: 0
Authors
Ming-Sheng Ying
Yuan Feng
Sheng-Gang Ying
Affiliations
[1] University of Technology, Centre for Quantum Software and Information
[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software
[3] Tsinghua University, Department of Computer Science and Technology
Keywords
Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making
DOI
Not available
Abstract
The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide mathematical tools for reinforcement learning techniques applied to the quantum world.
Pages: 410-421
Page count: 11