Map-based experience replay: a memory-efficient solution to catastrophic forgetting in reinforcement learning

Cited: 0
Authors
Hafez, Muhammad Burhan [1 ]
Immisch, Tilman [1 ]
Weber, Tom [1 ]
Wermter, Stefan [1 ]
Affiliations
[1] Univ Hamburg, Dept Informat, Knowledge Technol Res Grp, Hamburg, Germany
Keywords
continual learning; reinforcement learning; cognitive robotics; catastrophic forgetting; experience replay; growing self-organizing maps; GO; SHOGI; LEVEL; CHESS
DOI
10.3389/fnbot.2023.1127642
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (RL) agents often suffer from catastrophic forgetting: when training on new data, they forget previously found solutions in other parts of the input space. Replay memories are a common remedy, decorrelating and shuffling old and new training samples. However, they naively store state transitions as they arrive, without regard for redundancy. We introduce a novel cognitively inspired replay memory approach based on the Grow-When-Required (GWR) self-organizing network, which resembles a map-based mental model of the world. Our approach organizes stored transitions into a concise environment-model-like network of state nodes and transition edges, merging similar samples to reduce the memory size and increase the pairwise distance among samples, which increases the relevancy of each sample. Overall, our study shows that map-based experience replay allows for significant memory reduction with only small decreases in performance.
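The merging idea described in the abstract can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the authors' implementation: the class name, the exponential activity function, and the threshold values are assumptions, and the full GWR network's habituation counters and edge-aging mechanism are omitted.

```python
import random
import numpy as np

class GWRReplayMemory:
    """Sketch of a map-based replay memory: transitions are organized into
    a graph of state-prototype nodes and transition edges; states similar
    to an existing node are merged into it instead of stored verbatim."""

    def __init__(self, activity_threshold=0.5, learning_rate=0.1):
        self.a_T = activity_threshold  # below this activity, grow a new node
        self.eps = learning_rate       # step size when merging into a node
        self.nodes = []                # state prototypes (weight vectors)
        self.edges = {}                # (i, j) -> (action, reward)

    def _map(self, x):
        """Return the node index representing state x, growing or merging."""
        x = np.asarray(x, dtype=float)
        if not self.nodes:
            self.nodes.append(x.copy())
            return 0
        dists = [np.linalg.norm(x - w) for w in self.nodes]
        i = int(np.argmin(dists))
        activity = np.exp(-dists[i])   # close to 1 when x matches node i
        if activity < self.a_T:        # novel state: grow a new node
            self.nodes.append(x.copy())
            return len(self.nodes) - 1
        # familiar state: merge by moving the winner toward the sample
        self.nodes[i] += self.eps * (x - self.nodes[i])
        return i

    def add(self, state, action, reward, next_state):
        i = self._map(state)
        j = self._map(next_state)
        self.edges[(i, j)] = (action, reward)

    def sample(self):
        (i, j), (action, reward) = random.choice(list(self.edges.items()))
        return self.nodes[i], action, reward, self.nodes[j]
```

Because repeated near-identical transitions collapse into the same node pair, the stored graph stays far smaller than a naive first-in-first-out buffer holding every raw transition.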
Pages: 13
Related papers
50 records
  • [21] GradMA: A Gradient-Memory-based Accelerated Federated Learning with Alleviated Catastrophic Forgetting
    Luo, Kangyang
    Li, Xiang
    Lan, Yunshi
    Gao, Ming
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 3708 - 3717
  • [22] Batch process control based on reinforcement learning with segmented prioritized experience replay
    Xu, Chen
    Ma, Junwei
    Tao, Hongfeng
    [J]. MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (05)
  • [23] RMRL: Robot Navigation in Crowd Environments With Risk Map-Based Deep Reinforcement Learning
    Yang, Haodong
    Yao, Chenpeng
    Liu, Chengju
    Chen, Qijun
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (12) : 7930 - 7937
  • [24] REPLAY BUFFER WITH LOCAL FORGETTING FOR ADAPTING TO LOCAL ENVIRONMENT CHANGES IN DEEP MODEL-BASED REINFORCEMENT LEARNING
    Rahimi-Kalahroudi, Ali
    Rajendran, Janarthanan
    Momennejad, Ida
    van Seijen, Harm
    Chandar, Sarath
    [J]. CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023, 232 : 21 - 42
  • [25] Learning-Based Symbol Level Precoding: A Memory-Efficient Unsupervised Learning Approach
    Mohammad, Abdullahi
    Masouros, Christos
    Andreopoulos, Yiannis
    [J]. 2022 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2022, : 429 - 434
  • [26] Deep Reinforcement Learning Based on the Hindsight Experience Replay for Autonomous Driving of Mobile Robot
    Park, Minjae
    Hong, Jin Seok
    Kwon, Nam Kyu
    [J]. Journal of Institute of Control, Robotics and Systems, 2022, 28 (11) : 1006 - 1012
  • [27] Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation
    Zhang, Tiantian
    Wang, Xueqian
    Liang, Bin
    Yuan, Bo
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (12) : 9925 - 9939
  • [28] Prioritized experience replay based deep distributional reinforcement learning for battery operation in microgrids
    Panda, Deepak Kumar
    Turner, Oliver
    Das, Saptarshi
    Abusara, Mohammad
    [J]. JOURNAL OF CLEANER PRODUCTION, 2024, 434
  • [29] A Self-Organizing Map-Based Adaptive Traffic Light Control System with Reinforcement Learning
    Kao, Ying-Cih
    Wu, Cheng-Wen
    [J]. 2018 CONFERENCE RECORD OF 52ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2018, : 2060 - 2064
  • [30] Memory-Efficient Model-Based Deep Learning With Convergence and Robustness Guarantees
    Pramanik, Aniket
    Zimmerman, M. Bridget
    Jacob, Mathews
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2023, 9 : 260 - 275