Learning to forget: Continual prediction with LSTM

Cited by: 2344
Authors
Gers, FA [1 ]
Schmidhuber, J [1 ]
Cummins, F [1 ]
Affiliations
[1] IDSIA, CH-6900 Lugano, Switzerland
DOI
10.1162/089976600300015015
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.
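The forget-gate mechanism the abstract describes can be sketched as a single LSTM forward step. This is a minimal illustration under common textbook conventions, not the authors' original implementation; the weight names `Wf`, `Wi`, `Wc`, `Wo` are assumptions, and bias terms are omitted for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One forward step of an LSTM cell with a forget gate.

    The forget gate f scales the previous cell state c_prev, so the
    cell can learn to reset itself (f near 0) instead of letting its
    internal state grow indefinitely on a continual input stream.
    """
    Wf, Wi, Wc, Wo = params           # each maps [x; h_prev] -> cell size
    z = np.concatenate([x, h_prev])   # combined input and recurrent state
    f = sigmoid(Wf @ z)               # forget gate: 0 = reset, 1 = keep
    i = sigmoid(Wi @ z)               # input gate
    g = np.tanh(Wc @ z)               # candidate cell update
    o = sigmoid(Wo @ z)               # output gate
    c = f * c_prev + i * g            # forget gate modulates the old state
    h = o * np.tanh(c)                # gated cell output
    return h, c
```

Driving the forget gate toward zero wipes the old cell state regardless of how large it has grown, which is the "learn to reset at appropriate times" behavior the abstract attributes to the forget gate.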
Pages: 2451-2471
Page count: 21
Related Papers
50 entries in total
  • [31] Heterogeneous Continual Learning
    Madaan, Divyam
    Yin, Hongxu
    Byeon, Wonmin
    Kautz, Jan
    Molchanov, Pavlo
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15985 - 15995
  • [32] Residual Continual Learning
    Lee, Janghyeon
    Joo, Donggyu
    Hong, Hyeong Gwon
    Kim, Junmo
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 4553 - 4560
  • [33] Flashback for Continual Learning
    Mahmoodi, Leila
    Harandi, Mehrtash
    Moghadam, Peyman
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3426 - 3435
  • [34] Kernel Continual Learning
    Derakhshani, Mohammad Mahdi
    Zhen, Xiantong
    Shao, Ling
    Snoek, Cees G. M.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [35] Reinforced Continual Learning
    Xu, Ju
    Zhu, Zhanxing
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [36] Open-world continual learning: Unifying novelty detection and continual learning
    Kim, Gyuhak
    Xiao, Changnan
    Konishi, Tatsuya
    Ke, Zixuan
    Liu, Bing
    [J]. Artificial Intelligence, 2025, 338
  • [37] Continual World: A Robotic Benchmark For Continual Reinforcement Learning
    Wolczyk, Maciej
    Zajac, Michal
    Pascanu, Razvan
    Kucinski, Lukasz
    Milos, Piotr
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [38] Exemplary Care and Learning Sites: Linking the Continual Improvement of Learning and the Continual Improvement of Care
    Headrick, Linda A.
    Shalaby, Marc
    Baum, Karyn D.
    Fitzsimmons, Anne B.
    Hoffman, Kimberly G.
    Hoglund, Par J.
    Ogrinc, Greg
    Thorne, Karin
    [J]. ACADEMIC MEDICINE, 2011, 86 (11) : E6 - E7
  • [39] Implementation of Deep Learning Predictor (LSTM) Algorithm for Human Mobility Prediction
    Nurhaida I.
    Noprisson H.
    Ayumi V.
    Wei H.
    Putra E.D.
    Utami M.
    Setiawan H.
    [J]. International Journal of Interactive Mobile Technologies, 2020, 14 (18) : 132 - 144
  • [40] Analysis of Deep Learning Models for Early Action Prediction Using LSTM
    Manju, D.
    Seetha, M.
    Sammulal, P.
    [J]. INVENTIVE COMPUTATION AND INFORMATION TECHNOLOGIES, ICICIT 2021, 2022, 336 : 879 - 888