Optimal Policies for Quantum Markov Decision Processes

Cited by: 0

Authors:

Ming-Sheng Ying

Yuan Feng

Sheng-Gang Ying

Affiliations:

[1] University of Technology, Centre for Quantum Software and Information

[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software

[3] Tsinghua University, Department of Computer Science and Technology

Source:

International Journal of Automation and Computing | 2021 / Vol. 18

Keywords:

Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide mathematical tools for applying reinforcement learning techniques to the quantum world.
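
For orientation, the finite-horizon dynamic programming mentioned in the abstract can be illustrated on an ordinary (classical) MDP. The sketch below is not the paper's qMDP algorithm; it is textbook backward induction, and the function name `backward_induction`, the array layout `P[a][s][t]` and `R[a][s]`, and the two-state toy example are all assumptions made for illustration.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon backward induction for a classical MDP (illustrative only).

    P[a][s][t] is the probability of moving from state s to state t under
    action a; R[a][s] is the immediate reward for taking action a in state s.
    Returns (values, policy), where policy[k][s] is the action chosen in
    state s at time step k.
    """
    n_actions = len(P)
    n_states = len(P[0])
    values = [0.0] * n_states  # value when no steps remain
    policy = []
    for _ in range(horizon):
        new_values = []
        decision_rule = []
        for s in range(n_states):
            # Expected return of each action: reward now plus value afterwards.
            q = [
                R[a][s] + sum(P[a][s][t] * values[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            new_values.append(q[best])
            decision_rule.append(best)
        values = new_values
        # Rules are computed from the last step backwards, so prepend.
        policy.insert(0, decision_rule)
    return values, policy


# Hypothetical toy problem: action 0 stays put, action 1 switches state,
# and being in state 1 earns a reward of 1 per step.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 1.0],  # action 1
]
values, policy = backward_induction(P, R, horizon=3)
print(values)     # [2.0, 3.0]
print(policy[0])  # [1, 0]
```

In this toy case the first decision rule switches out of state 0 and stays in state 1, which matches the intuition that the agent should reach the rewarding state as early as possible.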


Pages: 410 - 421

Number of pages: 11

50 records in total

[41] EXISTENCE OF OPTIMAL STATIONARY POLICIES IN AVERAGE REWARD MARKOV DECISION-PROCESSES WITH A RECURRENT STATE
CAVAZOSCADENA, R
APPLIED MATHEMATICS AND OPTIMIZATION, 1992, 26 (02): : 171 - 194
[42] Computing semi-stationary optimal policies for multichain semi-Markov decision processes
Prasenjit Mondal
Annals of Operations Research, 2020, 287 : 843 - 865
[43] Computing semi-stationary optimal policies for multichain semi-Markov decision processes
Mondal, Prasenjit
ANNALS OF OPERATIONS RESEARCH, 2020, 287 (02) : 843 - 865
[44] CONDITIONS FOR EXISTENCE OF AVERAGE AND BLACKWELL OPTIMAL STATIONARY POLICIES IN DENUMERABLE MARKOV DECISION-PROCESSES
LASSERRE, JB
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1988, 136 (02) : 479 - 489
[45] Learning Policies for Markov Decision Processes in Continuous Spaces
Paternain, Santiago
Bazerque, Juan Andres
Small, Austin
Ribeiro, Alejandro
2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 4751 - 4758
[46] Finding Safe Zones of Markov Decision Processes Policies
Cohen, Lee
Mansour, Yishay
Moshkovitz, Michal
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[47] Learning Policies for Markov Decision Processes From Data
Hanawal, Manjesh Kumar
Liu, Hao
Zhu, Henghui
Paschalidis, Ioannis Ch.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (06) : 2298 - 2309
[48] Efficient Policies for Stationary Possibilistic Markov Decision Processes
Ben Amor, Nahla
El Khalfi, Zeineb
Fargier, Helene
Sabbadin, Regis
SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, ECSQARU 2017, 2017, 10369 : 306 - 317
[49] Reachability analysis of quantum Markov decision processes
Ying, Shenggang
Ying, Mingsheng
INFORMATION AND COMPUTATION, 2018, 263 : 31 - 51
[50] Quantum partially observable Markov decision processes
Barry, Jennifer
Barry, Daniel T.
Aaronson, Scott
PHYSICAL REVIEW A, 2014, 90 (03):
