Brief Announcement: Gradual Learning of Deep Recurrent Neural Network

Cited by: 2
Authors
Aharoni, Ziv [1]
Rattner, Gal [1]
Permuter, Haim [1]
Affiliations
[1] Ben Gurion Univ Negev, IL-8410501 Beer Sheva, Israel
Keywords
Data processing inequality; Machine learning; Recurrent neural networks; Regularization; Training methods
DOI
10.1007/978-3-319-94147-9_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence modeling tasks. However, deep RNNs are difficult to train and tend to overfit. Motivated by the Data Processing Inequality (DPI), we formulate the multi-layered network as a Markov chain and introduce a training method that trains the network gradually and uses layer-wise gradient clipping. We found that combining our methods with previously introduced regularization and optimization methods improved on state-of-the-art architectures in language modeling tasks.
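The abstract names layer-wise gradient clipping but does not spell out the rule. As a minimal sketch of one common reading (an assumption, not necessarily the authors' exact procedure), each layer's gradient is rescaled on its own so that its L2 norm stays under a threshold, instead of clipping by a single norm over all parameters. The function name `clip_by_layer`, the threshold `max_norm`, and the example gradients are hypothetical.

```python
import math


def clip_by_layer(grads_per_layer, max_norm):
    """Rescale each layer's gradient so that its own L2 norm is at most max_norm.

    grads_per_layer: list of flat lists of floats, one list per layer
    (hypothetical layout chosen for this illustration).
    """
    clipped = []
    for layer_grads in grads_per_layer:
        # L2 norm computed over this layer only, not the whole network.
        norm = math.sqrt(sum(g * g for g in layer_grads))
        # Shrink only when the layer's norm exceeds the threshold.
        scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
        clipped.append([g * scale for g in layer_grads])
    return clipped


# Hypothetical gradients for a two-layer network, flattened per layer.
grads = [[3.0, 4.0], [0.3, 0.4]]
clipped = clip_by_layer(grads, max_norm=1.0)
```

Here the first layer (norm 5) is scaled down to norm 1, while the second layer (norm 0.5) is left unchanged. Under a single global norm, both layers would be scaled by the same factor, so a large gradient in one layer would also shrink the updates of the others.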
Pages: 274-277
Page count: 4
Related papers
50 in total
  • [1] SpinalNet: Deep Neural Network With Gradual Input
    Dipu Kabir H.M.
    Abdar M.
    Khosravi A.
    Jalali S.M.J.
    Atiya A.F.
    Nahavandi S.
    Srinivasan D.
    IEEE Transactions on Artificial Intelligence, 2023, 4 (05): 1165 - 1177
  • [2] Deep evidential learning in diffusion convolutional recurrent neural network
    Feng, Zhiyuan
    Qi, Kai
    Shi, Bin
    Mei, Hao
    Zheng, Qinghua
    Wei, Hua
    ELECTRONIC RESEARCH ARCHIVE, 2023, 31 (04): 2252 - 2264
  • [3] Supervised Brain Network Learning Based on Deep Recurrent Neural Networks
    Zhao, Shijie
    Cui, Yan
    Huang, Linwei
    Xie, Li
    Chen, Yaowu
    Han, Junwei
    Guo, Lei
    Zhang, Shu
    Liu, Tianming
    Lv, Jinglei
    IEEE ACCESS, 2020, 8 (08): 69967 - 69978
  • [4] Recurrent and Deep Learning Neural Network Models for DDoS Attack Detection
    Sumathi, S.
    Rajesh, R.
    Lim, Sangsoon
    JOURNAL OF SENSORS, 2022, 2022
  • [5] Enhancing Deep Neural Network Saliency Visualizations With Gradual Extrapolation
    Szandala, Tomasz
    IEEE ACCESS, 2021, 9 : 95155 - 95161
  • [6] A hybrid deep learning model by combining convolutional neural network and recurrent neural network to detect forest fire
    Rajib Ghosh
    Anupam Kumar
    Multimedia Tools and Applications, 2022, 81 : 38643 - 38660
  • [7] A hybrid deep learning model by combining convolutional neural network and recurrent neural network to detect forest fire
    Ghosh, Rajib
    Kumar, Anupam
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (27) : 38643 - 38660
  • [8] A Review on Deep Learning with Focus on Deep Recurrent Neural Network for Electricity Forecasting in Residential Building
    Abdulrahman, Mustapha Lawal
    Ibrahim, Kabiru Musa
    Gital, Abdusalam Yau
    Zambuk, Fatima Umar
    Ja'afaru, Badamasi
    Yakubu, Zahraddeen Ismail
    Ibrahim, Abubakar
    10TH INTERNATIONAL YOUNG SCIENTISTS CONFERENCE IN COMPUTATIONAL SCIENCE (YSC2021), 2021, 193 : 141 - 154
  • [9] Signal Processing for Diffuse Correlation Spectroscopy with Recurrent Neural Network of Deep Learning
    Zhang, Peng
    Gui, Zhiguo
    Hao, Ling
    Zhang, Xiaojuan
    Liu, Caicai
    Shang, Yu
    2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (IEEE BIGDATASERVICE 2019), 2019: 328 - 332
  • [10] Hybrid deep learning diagonal recurrent neural network controller for nonlinear systems
    El-Nagar, Ahmad M.
    Zaki, Ahmad M.
    Soliman, F. A. S.
    El-Bardini, Mohammad
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (24): 22367 - 22386