Learning to forget: Continual prediction with LSTM

Cited by: 2337
Authors
Gers, FA [1]
Schmidhuber, J [1]
Cummins, F [1]
Affiliations
[1] IDSIA, CH-6900 Lugano, Switzerland
Keywords
DOI
10.1162/089976600300015015
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.
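The forget-gate mechanism summarized in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's exact formulation: the weight names (W_*, U_*, b_*) and the positive forget-gate bias are assumptions. The key point is that the cell state c is multiplied at every step by a learned factor f in (0, 1), so on a continual input stream the cell can decay or reset its own state instead of letting it grow indefinitely.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCellWithForgetGate:
    """One LSTM memory cell with input, forget, and output gates.
    Illustrative only: weight names and initialization are assumptions."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)

        def w(*shape):
            return rng.normal(0.0, 0.1, shape)

        # One (W, U, b) triple per gate and for the candidate update.
        self.W_i, self.U_i, self.b_i = w(hidden_size, input_size), w(hidden_size, hidden_size), np.zeros(hidden_size)
        # Forget-gate bias starts positive so the cell initially retains state.
        self.W_f, self.U_f, self.b_f = w(hidden_size, input_size), w(hidden_size, hidden_size), np.ones(hidden_size)
        self.W_o, self.U_o, self.b_o = w(hidden_size, input_size), w(hidden_size, hidden_size), np.zeros(hidden_size)
        self.W_g, self.U_g, self.b_g = w(hidden_size, input_size), w(hidden_size, hidden_size), np.zeros(hidden_size)

    def step(self, x, h_prev, c_prev):
        i = sigmoid(self.W_i @ x + self.U_i @ h_prev + self.b_i)  # input gate
        f = sigmoid(self.W_f @ x + self.U_f @ h_prev + self.b_f)  # forget gate
        o = sigmoid(self.W_o @ x + self.U_o @ h_prev + self.b_o)  # output gate
        g = np.tanh(self.W_g @ x + self.U_g @ h_prev + self.b_g)  # candidate update
        # In standard LSTM (effectively f = 1) the state c can grow without
        # bound on an unsegmented stream; f in (0, 1) lets the cell learn
        # when to decay or reset c, releasing its internal resources.
        c = f * c_prev + i * g
        h = o * np.tanh(c)
        return h, c
```

Because the forget gate keeps each step's state update contractive whenever f < 1, the cell state stays bounded on arbitrarily long streams, which is exactly the failure mode of standard LSTM that the paper addresses.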
Pages: 2451 - 2471 (21 pages)
Related papers (50 total)
  • [1] Learning to forget: Continual prediction with LSTM
    Gers, FA
    Schmidhuber, J
    Cummins, F
    [J]. NINTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS (ICANN99), VOLS 1 AND 2, 1999, (470): 850 - 855
  • [2] Continual prediction using LSTM with forget gates
    Gers, FA
    Schmidhuber, J
    Cummins, F
    [J]. NEURAL NETS - WIRN VIETRI-99, 1999: 133 - 138
  • [3] Forget-free Continual Learning with Winning Subnetworks
    Kang, Haeyong
    Mina, Rusty John Lloyd
    Madjid, Sultan Rizky Hikmawan
    Yoon, Jaehong
    Hasegawa-Johnson, Mark
    Hwang, Sung Ju
    Yoo, Chang D.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022: 10734 - 10750
  • [4] Prediction and Control in Continual Reinforcement Learning
    Anand, Nishanth
    Precup, Doina
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Expandable Orbit Decay Prediction Using Continual Learning
    He, Junhua
    Wang, Hua
    Wang, Haitao
    Fang, Xuankun
    Huo, Chengyi
    [J]. INTERNATIONAL JOURNAL OF AEROSPACE ENGINEERING, 2024, 2024
  • [6] Temporal Continual Learning with Prior Compensation for Human Motion Prediction
    Tang, Jianwei
    Sun, Jiangxin
    Lin, Xiaotong
    Zhang, Lifang
    Zheng, Wei-Shi
    Hu, Jian-Fang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Continual learning for seizure prediction via memory projection strategy
    Shi, Yufei
    Tang, Shishi
    Li, Yuxuan
    He, Zhipeng
    Tang, Shengsheng
    Wang, Ruixuan
    Zheng, Weishi
    Chen, Ziyi
    Zhou, Yi
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 181
  • [8] CONTINUAL LEARNING
    BROWN, WE
    [J]. JOURNAL OF THE AMERICAN DENTAL ASSOCIATION, 1965, 71 (04): : 935 - &
  • [9] Continual learning
    King, Denise
    [J]. JOURNAL OF EMERGENCY NURSING, 2008, 34 (04) : 283 - 283
  • [10] A Comparisons of BKT, RNN and LSTM for Learning Gain Prediction
    Lin, Chen
    Chi, Min
    [J]. ARTIFICIAL INTELLIGENCE IN EDUCATION, AIED 2017, 2017, 10331 : 536 - 539