SLER: Self-generated long-term experience replay for continual reinforcement learning

Cited by: 11
Authors
Li, Chunmao [1 ]
Li, Yang [1 ]
Zhao, Yinliang [1 ]
Peng, Peng [2 ]
Geng, Xupeng [1 ]
Affiliations
[1] Xi'an Jiaotong Univ, Dept Comp Sci & Technol, Xi'an 710049, Peoples R China
[2] Inspir Ai, Beijing, Peoples R China
Keywords
Continual reinforcement learning; Catastrophic forgetting; Dual experience replay; Experience replay model; NEURAL-NETWORKS; LEVEL; GAME; GO;
DOI
10.1007/s10489-020-01786-1
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep reinforcement learning has achieved significant success in various domains. However, it still faces a major challenge when learning multiple tasks in sequence, because interaction in a complex setting requires continual learning and the data distribution changes over time. A continual learning system should ensure that the agent acquires new knowledge without forgetting previous knowledge. However, catastrophic forgetting may occur when new experience overwrites previous experience due to limited memory size. The dual experience replay algorithm, which retains previous experience, is widely applied to reduce forgetting, but it does not scale to many tasks when the memory size is constrained. To alleviate this memory-size constraint, we propose a new continual reinforcement learning algorithm called Self-generated Long-term Experience Replay (SLER). Our method differs from the standard dual experience replay algorithm, in which a short-term experience replay buffer retains the current task's experience and a long-term experience replay buffer retains the experience of all past tasks to achieve continual learning. In this paper, we first train an environment sample model, called the Experience Replay Model (ERM), to generate simulated state sequences of previous tasks for knowledge retention. We then combine the ERM with the experience of the new task to generate simulated experience for all previous tasks and thereby alleviate forgetting. Our method effectively decreases the memory size required for multi-task reinforcement learning. We show that in the StarCraft II and GridWorld environments our method performs better than the state-of-the-art deep learning method and achieves results comparable to those of the dual experience replay method, which retains the experience of all tasks.
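The abstract describes a dual-replay scheme in which a short-term buffer holds real experience of the current task while a generative model regenerates experience of earlier tasks instead of storing it. Below is a minimal, self-contained Python sketch of that idea, not the authors' implementation: the names ShortTermBuffer, ExperienceReplayModel, and replay_batch are illustrative assumptions, and a tabular transition table stands in for the paper's learned environment model (ERM).

# Minimal SLER-style dual replay sketch (illustrative only; see note above).
import random
from collections import defaultdict, deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    state: int
    action: int
    reward: float
    next_state: int

class ShortTermBuffer:
    """Holds real transitions of the task currently being learned."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, k):
        return random.sample(list(self.buffer), min(k, len(self.buffer)))

class ExperienceReplayModel:
    """Toy stand-in for the ERM: memorises the dynamics of finished tasks and
    regenerates synthetic transitions on demand instead of storing them all."""
    def __init__(self):
        self.dynamics = defaultdict(list)   # (state, action) -> [(reward, next_state)]
        self.seen_pairs = []

    def fit(self, transitions):
        for t in transitions:
            self.dynamics[(t.state, t.action)].append((t.reward, t.next_state))
            self.seen_pairs.append((t.state, t.action))

    def generate(self, k):
        synthetic = []
        for _ in range(k):
            state, action = random.choice(self.seen_pairs)
            reward, next_state = random.choice(self.dynamics[(state, action)])
            synthetic.append(Transition(state, action, reward, next_state))
        return synthetic

def replay_batch(short_term, erm, batch_size=32, old_fraction=0.5):
    """Mix real current-task samples with ERM-generated past-task samples."""
    n_old = int(batch_size * old_fraction) if erm.seen_pairs else 0
    return short_term.sample(batch_size - n_old) + erm.generate(n_old)

if __name__ == "__main__":
    random.seed(0)
    erm = ExperienceReplayModel()
    # Task 1: collect real experience, then distil it into the ERM.
    task1 = ShortTermBuffer()
    for _ in range(200):
        s, a = random.randrange(5), random.randrange(2)
        task1.add(Transition(s, a, float(a), (s + a) % 5))
    erm.fit(list(task1.buffer))
    # Task 2: new real experience is mixed with regenerated task-1 experience.
    task2 = ShortTermBuffer()
    for _ in range(200):
        s, a = random.randrange(5), random.randrange(2)
        task2.add(Transition(s, a, -float(a), (s - a) % 5))
    batch = replay_batch(task2, erm)
    print(f"training batch of {len(batch)} transitions (real + synthetic)")

In the paper the ERM is trained to reproduce past-task state sequences, so only the model's parameters (plus the current task's buffer) need to be kept, which is why the memory requirement no longer grows with the number of tasks.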
Pages: 185-201
Number of pages: 17