Quantum computation for action selection using reinforcement learning

Cited by: 14
Authors
C. L. Chen [1]
D. Y. Dong [1]
Z. H. Chen [1]
Affiliation
[1] Univ Sci & Technol China, Dept Automat, Hefei 230027, Anhui, Peoples R China
Keywords
quantum computation; action selection; reinforcement learning; Grover iteration
DOI
10.1142/S0219749906002419
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
This paper proposes a novel action selection method based on quantum computation and reinforcement learning (RL). Inspired by the advantages of quantum computation, the states and actions of an RL system are represented as quantum superposition states. The probability of each action eigenvalue is represented by a probability amplitude, which is updated according to the rewards received. Action selection is carried out by observing the quantum state, in accordance with the collapse postulate of quantum measurement. Simulated experiments show that quantum computation can be used effectively for action selection and decision making by speeding up learning. The method also achieves a good tradeoff between exploration and exploitation in RL by exploiting the probabilistic nature of quantum theory.
Pages: 1071-1083
Number of pages: 13
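The mechanism outlined in the abstract (actions held in superposition, amplitudes reinforced, selection by measurement) can be illustrated with a small classical simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, amplitudes are restricted to real numbers, and the reinforcement step is a textbook Grover iteration (the technique named in the keywords) applied to the action that should be favoured. How the paper ties the number of iterations to the reward is not given in this record, so that choice is left to the caller.

```python
import math
import random


def uniform_amplitudes(n_actions):
    """Equal superposition: every action has probability 1 / n_actions."""
    return [1.0 / math.sqrt(n_actions)] * n_actions


def grover_iteration(amps, marked):
    """One textbook Grover iteration on real amplitudes (classical simulation).

    Step 1 (oracle): flip the sign of the marked action's amplitude.
    Step 2 (diffusion): reflect every amplitude about the mean.
    The sum of squared amplitudes is preserved.
    """
    flipped = [-a if i == marked else a for i, a in enumerate(amps)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def measure(amps, rng):
    """Simulate measurement: pick action i with probability amps[i] ** 2."""
    probs = [a * a for a in amps]
    return rng.choices(range(len(amps)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    amps = uniform_amplitudes(8)
    # Assumption: action 5 was rewarded, so it is reinforced once.
    amps = grover_iteration(amps, marked=5)
    print([round(a * a, 4) for a in amps])
    print("selected action:", measure(amps, rng))
```

After one iteration the reinforced action is likely but not certain to be chosen, so the other actions are still explored occasionally; this is the probabilistic exploration/exploitation balance the abstract refers to. Applying too many iterations rotates the state past the marked action and lowers its probability again, so any rule mapping reward to iteration count needs a cap.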