Brief Announcement: Gradual Learning of Deep Recurrent Neural Network

Cited by: 2
Authors
Aharoni, Ziv [1]
Rattner, Gal [1]
Permuter, Haim [1]
Affiliations
[1] Ben Gurion Univ Negev, IL-8410501 Beer Sheva, Israel
Keywords
Data-processing-inequality; Machine-learning; Recurrent-neural-networks; Regularization; Training-methods;
DOI
10.1007/978-3-319-94147-9_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence modeling tasks. However, deep RNNs are difficult to train and tend to suffer from overfitting. Motivated by the Data Processing Inequality (DPI), we formulate the multi-layered network as a Markov chain and introduce a training method that comprises training the network gradually and using layer-wise gradient clipping. Overall, we found that applying our methods in combination with previously introduced regularization and optimization methods improved on state-of-the-art architectures in language modeling tasks.
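The abstract names layer-wise gradient clipping without giving details. The following is a minimal sketch of one common reading of that term, assuming each layer's gradient is rescaled against its own L2-norm threshold rather than against a single global norm. The function name `clip_layerwise`, the flat-list gradient representation, and the threshold values are illustrative assumptions, not the authors' implementation.

```python
import math


def clip_layerwise(grads_by_layer, max_norms):
    """Rescale each layer's gradient so its L2 norm is at most that layer's threshold.

    grads_by_layer: list of flat gradient vectors, one per layer (hypothetical layout).
    max_norms: list of per-layer norm thresholds (hypothetical values).
    """
    clipped = []
    for grad, max_norm in zip(grads_by_layer, max_norms):
        norm = math.sqrt(sum(g * g for g in grad))
        # Leave the gradient untouched when it is already within the threshold.
        scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
        clipped.append([g * scale for g in grad])
    return clipped


# Two toy "layers": the first exceeds its threshold, the second does not.
grads = [[3.0, 4.0], [0.3, 0.4]]
clipped = clip_layerwise(grads, max_norms=[1.0, 1.0])
```

Under this assumption, a layer with an unusually large gradient is scaled down on its own, so it does not force the gradients of the other layers to shrink as a single global-norm clip would.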
Pages: 274 - 277
Number of pages: 4
Related papers
50 in total
  • [21] Adaptive Deep Learning with Optimization Hybrid Convolutional Neural Network and Recurrent Neural Network for Prediction Lemon Fruit Ripeness
    Watnakornbuncha, Darunee
    Am-Dee, Noppadol
    Sangsongfa, Adisak
    PRZEGLAD ELEKTROTECHNICZNY, 2024, 100 (03) : 202 - 211
  • [22] Deep Recurrent Neural Network for Seizure Detection
    Vidyaratne, L.
    Glandon, A.
    Alam, M.
    Iftekharuddin, K. M.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1202 - 1207
  • [23] Gradual learning for behavior acquisition by evolving artificial neural network
    Ooe, Ryosuke
    Kawakami, Takashi
    ARTIFICIAL LIFE AND ROBOTICS, 2016, 21 (04) : 399 - 404
  • [24] Announcement Capture System in Real Environments Using Recurrent Neural Network
    Nakazawa, Shintaro
    Sasaki, Takeshi
    2020 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION (SII), 2020, : 1046 - 1051
  • [25] Traffic Graph Convolutional Recurrent Neural Network: A Deep Learning Framework for Network-Scale Traffic Learning and Forecasting
    Cui, Zhiyong
    Henrickson, Kristian
    Ke, Ruimin
    Wang, Yinhai
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (11) : 4883 - 4894
  • [26] Deep Gradual Multi-Exposure Fusion Via Recurrent Convolutional Network
    Ryu, Je-Ho
    Kim, Jong-Han
    Kim, Jong-Ok
    IEEE ACCESS, 2021, 9 : 144756 - 144767
  • [27] Category learning in a recurrent neural network with reinforcement learning
    Zhang, Ying
    Pan, Xiaochuan
    Wang, Yihong
    FRONTIERS IN PSYCHIATRY, 2022, 13
  • [28] Face alignment using a deep neural network with local feature learning and recurrent regression
    Park, Byung-Hwa
    Oh, Se-Young
    Kim, Ig-Jae
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 89 : 66 - 80
  • [29] A learning framework of modified deep recurrent neural network for classification and recognition of voice mood
    Agarwal, Gaurav
    Om, Hari
    Gupta, Sachi
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2022, 36 (08) : 1835 - 1859
  • [30] A Deep Learning Model for Extracting Consumer Sentiments using Recurrent Neural Network Techniques
    Ranjan, Roop
    Daniel, A. K.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (08) : 238 - 246