Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited: 1820
Author(s)
Sherstinsky, Alex
Institution
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; BACKPROPAGATION;
DOI
10.1016/j.physd.2019.132306
Chinese Library Classification
O29 [Applied Mathematics];
Subject Classification Code
070104;
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasizes ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. (C) 2019 Elsevier B.V. All rights reserved.
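For orientation, the "Vanilla LSTM" inference step and the RNN "unrolling" technique that the abstract refers to can be sketched as follows. This is a minimal illustrative implementation, not the paper's own notation or its augmented LSTM variant; the gate naming (`f`, `i`, `o`, `g`) and the dict-based weight layout are assumptions made for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One inference step of a standard ("Vanilla") LSTM cell.

    W, U, b are dicts keyed by gate name: 'f' (forget), 'i' (input),
    'o' (output), 'g' (candidate cell update).
    """
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate state
    c_t = f * c_prev + i * g         # new cell (memory) state
    h_t = o * np.tanh(c_t)           # new hidden state / output
    return h_t, c_t

def unroll(xs, h0, c0, W, U, b):
    """"Unrolling": apply the same cell, with shared weights, at each step."""
    h, c = h0, c0
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
    return h, c
```

The key point of unrolling is that the recurrent network becomes a deep feed-forward network in which every layer shares the same parameters, which is what makes backpropagation through time applicable.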
Pages: 28
Related Papers (50 total)
  • [21] Applying Long Short-Term Memory Recurrent Neural Network for Intrusion Detection
    Althubiti, Sara
    Nick, William
    Mason, Janelle
    Yuan, Xiaohong
    Esterline, Albert
    IEEE SOUTHEASTCON 2018, 2018,
  • [22] Long Short-Term Memory Recurrent Neural Network Architectures for Melody Generation
    Mishra, Abhinav
    Tripathi, Kshitij
    Gupta, Lakshay
    Singh, Krishna Pratap
    SOFT COMPUTING FOR PROBLEM SOLVING, 2019, 817 : 41 - 55
  • [23] Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition
    Oruh, Jane
    Viriri, Serestina
    Adegun, Adekanmi
    IEEE ACCESS, 2022, 10 : 30069 - 30079
  • [24] Short-term Traffic Flow Prediction with LSTM Recurrent Neural Network
    Kang, Danqing
    Lv, Yisheng
    Chen, Yuan-yuan
    2017 IEEE 20TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2017,
  • [25] An Intelligent Recurrent Neural Network with Long Short-Term Memory (LSTM) BASED Batch Normalization for Medical Image Denoising
    Rajeev, R.
    Samath, J. Abdul
    Karthikeyan, N. K.
    JOURNAL OF MEDICAL SYSTEMS, 2019, 43 (08)
  • [27] Long short-term memory (LSTM) recurrent neural network for low-flow hydrological time series forecasting
    Sahoo, Bibhuti Bhusan
    Jha, Ramakar
    Singh, Anshuman
    Kumar, Deepak
    ACTA GEOPHYSICA, 2019, 67 (05) : 1471 - 1481
  • [29] Long Short-term Memory Neural Network for Network Traffic Prediction
    Zhuo, Qinzheng
    Li, Qianmu
    Yan, Han
    Qi, Yong
    2017 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (IEEE ISKE), 2017,
  • [30] Simplified Gating in Long Short-term Memory (LSTM) Recurrent Neural Networks
    Lu, Yuzhen
    Salem, Fathi M.
    2017 IEEE 60TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2017, : 1601 - 1604