Learning offline: memory replay in biological and artificial reinforcement learning

Cited by: 17

Authors:

Roscow, Emma L. ^{[1]}

Chua, Raymond ^{[2]}

Costa, Rui Ponte ^{[3]}

Jones, Matt W. ^{[4]}

Lepora, Nathan ^{[5,6]}

Affiliations:

[1] Ctr Recerca Matemat, Bellaterra, Spain

[2] McGill Univ & Mila, Montreal, PQ, Canada

[3] Univ Bristol, Dept Comp Sci, Intelligent Syst Lab, Bristol Computat Neurosci Unit, Bristol, Avon, England

[4] Univ Bristol, Sch Physiol Pharmacol & Neurosci, Bristol, Avon, England

[5] Univ Bristol, Dept Engn Math, Bristol, Avon, England

[6] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England

Source:

TRENDS IN NEUROSCIENCES | 2021 / Vol. 44 / No. 10

Funding:

Wellcome Trust (UK); Natural Sciences and Engineering Research Council of Canada;

Keywords:

SHARP-WAVE RIPPLES; HIPPOCAMPAL REPLAY; PREFRONTAL CORTEX; VALUE REPRESENTATIONS; NEURAL-NETWORKS; SPATIAL MEMORY; VISUAL-CORTEX; AWAKE REPLAY; REACTIVATION; SLEEP;

DOI:

10.1016/j.tins.2021.07.007

Chinese Library Classification number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Learning to act in an environment to maximise rewards is among the brain's key functions. This process has often been conceptualised within the framework of reinforcement learning, which has also gained prominence in machine learning and artificial intelligence (AI) as a way to optimise decision making. A common aspect of both biological and machine reinforcement learning is the reactivation of previously experienced episodes, referred to as replay. Replay is important for memory consolidation in biological neural networks and is key to stabilising learning in deep neural networks. Here, we review recent developments concerning the functional roles of replay in the fields of neuroscience and AI. Complementary progress suggests how replay might support learning processes, including generalisation and continual learning, affording opportunities to transfer knowledge across the two fields to advance the understanding of biological and artificial learning and memory.

Pages: 808-821

Page count: 14
