Experience Replay Optimization via ESMM for Stable Deep Reinforcement Learning

Cited by: 0
Authors
Osei, Richard Sakyi [1 ]
Lopez, Daphne [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci Engn & Informat Syst, Vellore, India
Keywords
Experience replay; experience replay optimization; experience retention strategy; experience selection strategy; replay memory management;
DOI
Not available
CLC number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
The memorization and reuse of experience, popularly known as experience replay (ER), has improved the performance of off-policy deep reinforcement learning (DRL) algorithms such as deep Q-networks (DQN) and deep deterministic policy gradients (DDPG). Despite its success, ER faces the challenges of noisy transitions, large memory sizes, and unstable returns. Researchers have introduced replay mechanisms focusing on experience selection strategies to address these issues. However, the choice of experience retention strategy has a significant influence on the selection strategy. Experience Replay Optimization (ERO) is a reinforcement learning algorithm that uses a deep replay policy for experience selection. However, ERO relies on the naive first-in-first-out (FIFO) retention strategy, which manages the replay memory by always retaining the most recent experiences irrespective of their relevance to the agent's learning: when the replay memory is full, FIFO sequentially overwrites the oldest experience with a new one. To improve the retention strategy of ERO, we propose experience replay optimization with enhanced sequential memory management (ERO-ESMM). ERO-ESMM uses an improved sequential retention strategy to manage the replay memory efficiently and stabilize the performance of the DRL agent. The efficacy of the ESMM strategy is evaluated against five other fundamental retention strategies across four distinct OpenAI environments. The experimental results indicate that ESMM performs better than the other five retention strategies.
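The FIFO retention strategy that the abstract describes can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation; the class name `FIFOReplayBuffer` and its methods are hypothetical:

```python
import random
from collections import deque


class FIFOReplayBuffer:
    """Minimal FIFO replay memory: once the buffer is full, storing a new
    transition overwrites the oldest one, regardless of its learning value."""

    def __init__(self, capacity):
        # deque with maxlen automatically discards the oldest item when full,
        # which is exactly the FIFO retention behavior described in the abstract
        self.buffer = deque(maxlen=capacity)

    def store(self, transition):
        # transition is typically a tuple (state, action, reward, next_state, done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random selection; ERO instead learns a deep replay policy
        # to decide which stored transitions to replay
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

A retention strategy such as ESMM would change which slot is evicted on `store` (rather than always the oldest); the paper evaluates that choice separately from the selection step performed in `sample`.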
Pages: 715 / 723
Number of pages: 9