An Experience Replay Method Based on Tree Structure for Reinforcement Learning

Cited by: 1
Authors
Jiang, Wei-Cheng [1 ]
Hwang, Kao-Shing [1 ,2 ]
Lin, Jin-Ling [3 ]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 80424, Taiwan
[2] Kaohsiung Med Univ, Dept Healthcare Adm & Med Informat, Kaohsiung, Taiwan
[3] Shih Hsin Univ, Dept Informat Management, Taipei 116, Taiwan
Keywords
Reinforcement learning; Computer architecture; Planning; Adaptation models; Predictive models; Computational modeling; Approximation algorithms; Dyna-Q architecture; tree structure; experience replay
DOI
10.1109/TETC.2018.2890682
CLC Classification Number
TP [automation technology, computer technology]
Subject Classification Code
0812
Abstract
Q-Learning is a well-known model-free reinforcement learning algorithm in which a learning agent explores an environment to update a state-action value function. In reinforcement learning, the agent requires no prior information about the environment, so it must interact with the environment to collect real experiences, an expensive and time-consuming process. To reduce the burden of this interaction, sample efficiency plays an important role in reinforcement learning. This study proposes an adaptive tree structure integrated with experience replay for Q-Learning, called ERTS-Q. In the ERTS-Q method, Q-Learning is used for policy learning, while a tree structure establishes a virtual model: it observes the two continuous states involved in each state transition and calculates the variation between them. After each state transition, states with highly similar variations are aggregated into the same leaf node; otherwise, new leaf nodes are produced. For experience replay, the tree structure predicts the next state and reward from the statistical information stored in its nodes. The virtual experiences produced by the tree structure are then used for additional learning. Simulations on the mountain car problem and a maze environment verify the validity of the proposed model-learning approach.
Pages: 972 - 982
Number of pages: 11
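
The abstract describes three interacting pieces: a Q-learning learner, a tree-structured virtual model that clusters transitions by the similarity of their state variations (s' - s), and a replay phase that draws virtual transitions from the leaf statistics for extra updates. The Python sketch below illustrates that loop under broad assumptions; the flat nearest-leaf clustering stands in for the paper's adaptive tree, and every name (VariationModel, erts_q_step, threshold) is hypothetical rather than taken from the authors' implementation.

import random
from collections import defaultdict

class Leaf:
    """Hypothetical leaf statistics: running means of the state variation
    (s' - s) and the reward for the transitions grouped into this leaf."""
    def __init__(self, delta, reward):
        self.delta = list(delta)
        self.reward = reward
        self.count = 1

    def distance(self, delta):
        return sum((m - d) ** 2 for m, d in zip(self.delta, delta)) ** 0.5

    def absorb(self, delta, reward):
        self.count += 1
        w = 1.0 / self.count  # incremental mean update
        self.delta = [m + w * (d - m) for m, d in zip(self.delta, delta)]
        self.reward += w * (reward - self.reward)

class VariationModel:
    """Flat stand-in for the paper's adaptive tree: per action, transitions
    with similar state variations share a leaf; dissimilar ones open a new
    leaf. Prediction samples a leaf and rolls the query state forward."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.leaves = defaultdict(list)  # action -> [Leaf, ...]

    def insert(self, s, a, reward, s_next):
        delta = [x2 - x1 for x1, x2 in zip(s, s_next)]
        best = min(self.leaves[a], key=lambda l: l.distance(delta), default=None)
        if best is not None and best.distance(delta) < self.threshold:
            best.absorb(delta, reward)
        else:
            self.leaves[a].append(Leaf(delta, reward))  # new leaf for a dissimilar variation

    def predict(self, s, a):
        leaves = self.leaves.get(a)
        if not leaves:
            return None
        leaf = random.choices(leaves, weights=[l.count for l in leaves])[0]
        s_next = tuple(x + d for x, d in zip(s, leaf.delta))
        return s_next, leaf.reward

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    # Standard one-step Q-learning backup. States are tuples so they can
    # serve as dictionary keys; a real implementation over continuous
    # spaces would discretize or use function approximation.
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def erts_q_step(Q, model, s, a, r, s_next, actions, n_replay=10):
    q_update(Q, s, a, r, s_next, actions)  # learn from the real experience
    model.insert(s, a, r, s_next)          # update the virtual model
    # Extra learning from virtual experiences. For brevity we replay from
    # the current state; Dyna-style planning would also sample previously
    # visited states.
    for _ in range(n_replay):
        b = random.choice(actions)
        virtual = model.predict(s, b)
        if virtual is not None:
            v_next, v_r = virtual
            q_update(Q, s, b, v_r, v_next, actions)

With Q = defaultdict(float) and actions a small tuple of discrete actions (as in mountain car), erts_q_step would be called once per real environment step, so each interaction with the environment yields one real update plus n_replay virtual ones, which is the sample-efficiency gain the abstract argues for.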