Learning offline: memory replay in biological and artificial reinforcement learning

Cited by: 17
Authors
Roscow, Emma L. [1]
Chua, Raymond [2]
Costa, Rui Ponte [3]
Jones, Matt W. [4]
Lepora, Nathan [5, 6]
Affiliations
[1] Ctr Recerca Matemat, Bellaterra, Spain
[2] McGill Univ & Mila, Montreal, PQ, Canada
[3] Univ Bristol, Dept Comp Sci, Intelligent Syst Lab, Bristol Computat Neurosci Unit, Bristol, Avon, England
[4] Univ Bristol, Sch Physiol Pharmacol & Neurosci, Bristol, Avon, England
[5] Univ Bristol, Dept Engn Math, Bristol, Avon, England
[6] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England
Funding
Wellcome Trust (UK); Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
SHARP-WAVE RIPPLES; HIPPOCAMPAL REPLAY; PREFRONTAL CORTEX; VALUE REPRESENTATIONS; NEURAL-NETWORKS; SPATIAL MEMORY; VISUAL-CORTEX; AWAKE REPLAY; REACTIVATION; SLEEP;
DOI
10.1016/j.tins.2021.07.007
Chinese Library Classification (CLC) number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Learning to act in an environment to maximise rewards is among the brain's key functions. This process has often been conceptualised within the framework of reinforcement learning, which has also gained prominence in machine learning and artificial intelligence (AI) as a way to optimise decision making. A common aspect of both biological and machine reinforcement learning is the reactivation of previously experienced episodes, referred to as replay. Replay is important for memory consolidation in biological neural networks and is key to stabilising learning in deep neural networks. Here, we review recent developments concerning the functional roles of replay in the fields of neuroscience and AI. Complementary progress suggests how replay might support learning processes, including generalisation and continual learning, affording opportunities to transfer knowledge across the two fields to advance the understanding of biological and artificial learning and memory.
Pages: 808-821
Page count: 14
Related papers
50 in total
  • [41] Memory Reduction through Experience Classification for Deep Reinforcement Learning with Prioritized Experience Replay
    Shen, Kai-Huan
    Tsai, Pei-Yun
    PROCEEDINGS OF THE 2019 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2019), 2019, : 166 - 171
  • [42] Parallelized Synchronous Multi-Agent Deep Reinforcement Learning with Experience Replay Memory
    Gong, Xudong
    Ding, Bo
    Xu, Jie
    Wang, Huaimin
    Zhou, Xing
    Feng, Dawei
    2019 13TH IEEE INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED SYSTEM ENGINEERING (SOSE) / 10TH INTERNATIONAL WORKSHOP ON JOINT CLOUD COMPUTING (JCC) / IEEE INTERNATIONAL WORKSHOP ON CLOUD COMPUTING IN ROBOTIC SYSTEMS (CCRS), 2019, : 325 - 330
  • [43] Adaptable Conservative Q-Learning for Offline Reinforcement Learning
    Qiu, Lyn
    Li, Xu
    Liang, Lenghan
    Sun, Mingming
    Yan, Junchi
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 200 - 212
  • [44] Doubly constrained offline reinforcement learning for learning path recommendation
    Yun, Yue
    Dai, Huan
    An, Rui
    Zhang, Yupei
    Shang, Xuequn
    Knowledge-Based Systems, 2024, 284
  • [46] Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
    Rashidinejad, Paria
    Zhu, Banghua
    Ma, Cong
    Jiao, Jiantao
    Russell, Stuart
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [47] Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
    Rashidinejad, Paria
    Zhu, Banghua
    Ma, Cong
    Jiao, Jiantao
    Russell, Stuart
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, 68 (12) : 8156 - 8196
  • [48] Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
    Zheng, Han
    Luo, Xufang
    Wei, Pengfei
    Song, Xuan
    Li, Dongsheng
    Jiang, Jing
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11372 - 11380
  • [49] Mildly Conservative Q-Learning for Offline Reinforcement Learning
    Lyu, Jiafei
    Ma, Xiaoteng
    Li, Xiu
    Lu, Zongqing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [50] Deadly triad matters for offline reinforcement learning
    Peng, Zhiyong
    Liu, Yadong
    Zhou, Zongtan
    KNOWLEDGE-BASED SYSTEMS, 2024, 284