Optimal Policies for Quantum Markov Decision Processes

Cited by: 0

Authors:

Ming-Sheng Ying

Yuan Feng

Sheng-Gang Ying

Affiliations:

[1] University of Technology, Centre for Quantum Software and Information

[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software

[3] Tsinghua University, Department of Computer Science and Technology

Source:

International Journal of Automation and Computing | 2021 / Vol. 18

Keywords:

Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making

DOI:

Not available

Abstract:

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide mathematical tools for applying reinforcement learning techniques to the quantum world.
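
For orientation, finite-horizon dynamic programming in the classical (non-quantum) setting can be sketched as backward induction. This is only an illustration of the classical technique the abstract builds on, not the qMDP algorithm of the paper; the function name `backward_induction`, the array layout of `P` and `R`, and the two-state example are all assumptions made here.

```python
import numpy as np

def backward_induction(P, R, horizon):
    """Classical finite-horizon MDP solver (illustrative, not the paper's qMDP method).

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[a, s] is the expected immediate reward for taking action a in state s.
    Returns V (values per time step) and policy (best action per time step and state).
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros((horizon + 1, n_states))       # V[horizon] = 0: no reward after the end
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in range(horizon - 1, -1, -1):        # sweep backwards in time
        Q = R + P @ V[t + 1]                    # Q[a, s]: reward now plus expected future value
        policy[t] = Q.argmax(axis=0)            # greedy action for each state at time t
        V[t] = Q.max(axis=0)
    return V, policy

# Hypothetical toy problem: action 0 stays put, action 1 switches state;
# being in state 1 yields reward 1 regardless of the action taken.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [0.0, 1.0]])
V, policy = backward_induction(P, R, horizon=2)
```

In this toy problem the first-step policy switches out of state 0 and stays in state 1, since only state 1 pays a reward.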

Pages: 410-421

Page count: 11

50 items in total

[1] Optimal Policies for Quantum Markov Decision Processes
Ying, Ming-Sheng
Feng, Yuan
Ying, Sheng-Gang
INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2021, 18 (03) : 410 - 421
[2] Optimal Policies for Quantum Markov Decision Processes
Ming-Sheng Ying
Yuan Feng
Sheng-Gang Ying
Machine Intelligence Research, 2021, 18 (03) : 410 - 421
[3] Optimal Decision Tree Policies for Markov Decision Processes
Vos, Daniel
Verwer, Sicco
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023 : 5457 - 5465
[4] IDENTIFICATION OF OPTIMAL POLICIES IN MARKOV DECISION PROCESSES
Sladky, Karel
KYBERNETIKA, 2010, 46 (03) : 558 - 570
[5] Optimal adaptive policies for Markov decision processes
Burnetas, AN
Katehakis, MN
MATHEMATICS OF OPERATIONS RESEARCH, 1997, 22 (01) : 222 - 255
[6] MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
SERFOZO, RF
MATHEMATICAL PROGRAMMING STUDY, 1976, 6 (DEC) : 202 - 215
[7] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
Daniel Cruz-Suárez
Raúl Montes-de-Oca
Francisco Salem-Silva
Mathematical Methods of Operations Research, 2004, 60 : 415 - 436
[8] NOTE ON MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
KALIN, D
MATHEMATICAL PROGRAMMING, 1978, 15 (02) : 220 - 222
[9] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
Cruz-Suárez, D
Montes-de-Oca, R
Salem-Silva, F
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2004, 60 (03) : 415 - 436
[10] ISOTONE OPTIMAL POLICIES FOR STRUCTURED MARKOV DECISION-PROCESSES
WHITE, DJ
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1981, 7 (04) : 396 - 402