Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited by: 1820
Author
Sherstinsky, Alex
Institution
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; Backpropagation
DOI
10.1016/j.physd.2019.132306
CLC Number
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasizes ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. (C) 2019 Elsevier B.V. All rights reserved.
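The "Vanilla LSTM" system and the unrolling technique that the abstract refers to can be sketched as follows. This is a minimal illustration in the common gate notation (forget/input/candidate/output), not the paper's own notation or its augmented variant; all names and shapes here are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One inference step of a standard ("Vanilla") LSTM cell.

    params = (W, U, b): input weights, recurrent weights, and biases
    for all four gates, stacked along the first axis (4*n rows).
    """
    W, U, b = params
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b       # pre-activations for all gates, shape (4*n,)
    f = sigmoid(z[0:n])              # forget gate
    i = sigmoid(z[n:2*n])            # input gate
    g = np.tanh(z[2*n:3*n])          # candidate cell update
    o = sigmoid(z[3*n:4*n])          # output gate
    c = f * c_prev + i * g           # new cell (memory) state
    h = o * np.tanh(c)               # new hidden state
    return h, c

# "Unrolling" the RNN: the same cell, with shared weights, is applied
# at every time step of the input sequence.
rng = np.random.default_rng(0)
n_in, n_hid, T = 3, 4, 5
params = (0.1 * rng.standard_normal((4 * n_hid, n_in)),
          0.1 * rng.standard_normal((4 * n_hid, n_hid)),
          np.zeros(4 * n_hid))
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(T):
    h, c = lstm_step(rng.standard_normal(n_in), h, c, params)
print(h.shape)  # prints (4,)
```

Because the output gate is sigmoidal and the cell state passes through tanh, every component of the hidden state stays strictly inside (-1, 1), which is part of what makes the LSTM easier to train than the standard RNN.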
Pages: 28
Related Papers
50 records
  • [1] Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network
    Sherstinsky, Alex
arXiv, 2018
  • [2] Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) Power Forecasting
    Alsabban, Maha S.
    Salem, Nema
    Malik, Hebatullah M.
APPEEC 2021: 2021 13TH IEEE PES ASIA PACIFIC POWER & ENERGY ENGINEERING CONFERENCE (APPEEC), 2021
  • [3] Using a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) to Classify Network Attacks
    Muhuri, Pramita Sree
    Chatterjee, Prosenjit
    Yuan, Xiaohong
    Roy, Kaushik
    Esterline, Albert
    INFORMATION, 2020, 11 (05)
  • [5] Recognition of Handwritten Text using Long Short Term Memory (LSTM) Recurrent Neural Network (RNN)
    Paul, I. Joe Louis
    Sasirekha, S.
    Vishnu, D. Raghul
    Surya, K.
    RECENT DEVELOPMENTS IN MATHEMATICAL ANALYSIS AND COMPUTING, 2019, 2095
  • [6] Long short-term memory (LSTM) recurrent neural network for muscle activity detection
    Ghislieri, Marco
    Cerone, Giacinto Luigi
    Knaflitz, Marco
    Agostini, Valentina
    JOURNAL OF NEUROENGINEERING AND REHABILITATION, 2021, 18 (01)
  • [8] Prediction of Indonesian Palm Oil Production Using Long Short-Term Memory Recurrent Neural Network (LSTM-RNN)
    Sugiyarto, Aditya Wisnugraha
    Abadi, Agus Maman
    2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA SCIENCES (AIDAS2019), 2019: 53-57
  • [9] Artificial Intelligence for Sport Actions and Performance Analysis using Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM)
    Fok, Wilton W. T.
    Chan, Louis C. W.
    Chen, Carol
    ICRAI 2018: PROCEEDINGS OF 2018 4TH INTERNATIONAL CONFERENCE ON ROBOTICS AND ARTIFICIAL INTELLIGENCE, 2018: 40-44
  • [10] HOURLY DISCHARGE PREDICTION USING LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK (LSTM-RNN) IN THE UPPER CITARUM RIVER
    Enung
    Kusuma, Muhammad Syahril Badri
    Kardhana, Hadi
    Suryadi, Yadi
    Rohmat, Faizal Immaddudin Wira
    INTERNATIONAL JOURNAL OF GEOMATE, 2022, 23 (98): 147-154