An Experience Replay Method Based on Tree Structure for Reinforcement Learning

Cited by: 1
Authors
Jiang, Wei-Cheng [1 ]
Hwang, Kao-Shing [1 ,2 ]
Lin, Jin-Ling [3 ]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 80424, Taiwan
[2] Kaohsiung Med Univ, Dept Healthcare Adm & Med Informat, Kaohsiung, Taiwan
[3] Shih Hsin Univ, Dept Informat Management, Taipei 116, Taiwan
Keywords
Reinforcement learning; Computer architecture; Planning; Adaptation models; Predictive models; Computational modeling; Approximation algorithms; Dyna-Q architecture; tree structure; experience replay
DOI
10.1109/TETC.2018.2890682
Chinese Library Classification
TP [automation technology, computer technology]
Discipline classification code
0812
Abstract
Q-Learning is a well-known model-free reinforcement learning algorithm in which a learning agent explores an environment to update a state-action value function. Because the agent requires no prior information about the environment, it must interact with the environment to collect real experiences, which is an expensive and time-consuming process. To reduce this interaction burden, sample efficiency plays an important role in reinforcement learning. This study proposes an adaptive tree structure integrated with experience replay for Q-Learning, called ERTS-Q. In the ERTS-Q method, Q-Learning is used for policy learning, while a tree structure establishes a virtual model: it perceives the two consecutive continuous states involved in each state transition and computes the variation between them. After each state transition, states with highly similar variations are aggregated into the same leaf node; otherwise, new leaf nodes are produced. For experience replay, the tree structure predicts the next state and reward from the statistical information stored in its nodes. The virtual experiences produced by the tree structure are then used for extra learning. Simulations on the mountain-car and maze environments verify the validity of the proposed model-learning approach.
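A minimal sketch of the Dyna-Q-style loop the abstract describes: real Q-Learning updates are supplemented with extra updates replayed from a learned virtual model. The tabular setting, the toy chain environment, and the plain dictionary model are hypothetical stand-ins here; the paper's ERTS-Q replaces the dictionary with an adaptive tree over continuous-state variations.

```python
import random

def dyna_q(env_step, n_states, n_actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-Learning with Dyna-style replay from a learned model.

    env_step(s, a) -> (next_state, reward, done). The dict `model` stands
    in for the paper's tree structure: it memorizes observed transitions
    and replays them as virtual experience for extra learning.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (s, a) -> (s', r), built from real experience

    def update(s, a, r, s2):
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection in the real environment
            a = (rng.randrange(n_actions) if rng.random() < epsilon
                 else max(range(n_actions), key=lambda i: Q[s][i]))
            s2, r, done = env_step(s, a)
            update(s, a, r, s2)        # learn from the real transition
            model[(s, a)] = (s2, r)    # record it in the virtual model
            # planning: extra updates from replayed virtual experience
            for _ in range(planning_steps):
                (ps, pa), (ps2, pr) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2)
            s = s2
    return Q

# Example: a 4-state chain where action 1 moves right toward a reward
# at the terminal state 3, and action 0 moves left.
def chain_step(s, a):
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3
```

After training, moving toward the goal should dominate at every non-terminal state; the planning steps are what let the reward propagate back quickly from only a few real successes.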
Pages: 972 - 982
Number of pages: 11
Related Papers
50 in total
  • [31] Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning
    Lin, Yijiong
    Huang, Jiancong
    Zimmer, Matthieu
    Guan, Yisheng
    Rojas, Juan
    Weng, Paul
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 6615 - 6622
  • [32] Memory Reduction through Experience Classification for Deep Reinforcement Learning with Prioritized Experience Replay
    Shen, Kai-Huan
    Tsai, Pei-Yun
    [J]. PROCEEDINGS OF THE 2019 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2019), 2019, : 166 - 171
  • [33] Double Broad Reinforcement Learning Based on Hindsight Experience Replay for Collision Avoidance of Unmanned Surface Vehicles
    Yu, Jiabao
    Chen, Jiawei
    Chen, Ying
    Zhou, Zhiguo
    Duan, Junwei
    [J]. JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (12)
  • [34] Map-based experience replay: a memory-efficient solution to catastrophic forgetting in reinforcement learning
    Hafez, Muhammad Burhan
    Immisch, Tilman
    Weber, Tom
    Wermter, Stefan
    [J]. FRONTIERS IN NEUROROBOTICS, 2023, 17
  • [35] A sample efficient model-based deep reinforcement learning algorithm with experience replay for robot manipulation
    Zhang, Cheng
    Ma, Liang
    Schmitz, Alexander
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT ROBOTICS AND APPLICATIONS, 2020, 4 (02) : 217 - 228
  • [37] Unveiling the Effects of Experience Replay on Deep Reinforcement Learning-based Power Allocation in Wireless Networks
    Kopic, Amna
    Perenda, Erma
    Gacanin, Haris
    [J]. 2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024
  • [38] Multi-Input Autonomous Driving Based on Deep Reinforcement Learning With Double Bias Experience Replay
    Cui, Jianping
    Yuan, Liang
    He, Li
    Xiao, Wendong
    Ran, Teng
    Zhang, Jianbo
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (11) : 11253 - 11261
  • [39] Exploring a Reinforcement Learning Agent with Improved Prioritized Experience Replay for a Confrontation Game
    Zhao, Tian
    [J]. 2022 INTERNATIONAL CONFERENCE ON BIG DATA, INFORMATION AND COMPUTER NETWORK (BDICN 2022), 2022, : 373 - 381
  • [40] Research on Experience Replay of Off-policy Deep Reinforcement Learning: A Review
    Hu, Zi-Jian
    Gao, Xiao-Guang
    Wan, Kai-Fang
    Zhang, Le-Tian
    Wang, Qiang-Long
    Neretin, Evgeny
    [J]. Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (11) : 2237 - 2256