Optimal Policies for Quantum Markov Decision Processes

Cited by: 0
Authors
Ming-Sheng Ying
Yuan Feng
Sheng-Gang Ying
Affiliations
[1] University of Technology, Centre for Quantum Software and Information
[2] Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software
[3] Tsinghua University, Department of Computer Science and Technology
Keywords
Quantum Markov decision processes; quantum machine learning; reinforcement learning; dynamic programming; decision making
DOI
Not available
Abstract
The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
Pages: 410 - 421
Number of pages: 11
Related papers
50 in total
  • [1] Optimal Policies for Quantum Markov Decision Processes
    Ying, Ming-Sheng
    Feng, Yuan
    Ying, Sheng-Gang
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2021, 18 (03) : 410 - 421
  • [2] Optimal Policies for Quantum Markov Decision Processes
    Ming-Sheng Ying
    Yuan Feng
    Sheng-Gang Ying
    Machine Intelligence Research, 2021, 18 (03) : 410 - 421
  • [3] Optimal Decision Tree Policies for Markov Decision Processes
    Vos, Daniel
    Verwer, Sicco
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023 : 5457 - 5465
  • [4] IDENTIFICATION OF OPTIMAL POLICIES IN MARKOV DECISION PROCESSES
    Sladky, Karel
    KYBERNETIKA, 2010, 46 (03) : 558 - 570
  • [5] Optimal adaptive policies for Markov decision processes
    Burnetas, AN
    Katehakis, MN
    MATHEMATICS OF OPERATIONS RESEARCH, 1997, 22 (01) : 222 - 255
  • [6] MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    SERFOZO, RF
    MATHEMATICAL PROGRAMMING STUDY, 1976, 6 (DEC) : 202 - 215
  • [7] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Daniel Cruz-Suárez
    Raúl Montes-de-Oca
    Francisco Salem-Silva
    Mathematical Methods of Operations Research, 2004, 60 : 415 - 436
  • [8] NOTE ON MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    KALIN, D
    MATHEMATICAL PROGRAMMING, 1978, 15 (02) : 220 - 222
  • [9] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Cruz-Suárez, D
    Montes-de-Oca, R
    Salem-Silva, F
    MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2004, 60 (03) : 415 - 436