aSNAQ: An adaptive stochastic Nesterov's accelerated quasi-Newton method for training RNNs

Cited by: 0
Authors
Sendilkkumaar, Indrapriyadarsini [1 ]
Mahboubi, Shahrzad [2 ]
Ninomiya, Hiroshi [2 ]
Asai, Hideki [3 ]
Affiliations
[1] Shizuoka Univ, Grad Sch Sci & Technol, Naka Ku, 3-5-1 Johoku, Hamamatsu, Shizuoka 4328561, Japan
[2] Shonan Inst Technol, Grad Sch Elect & Informat Engn, 1-1-25 Tsujido Nishikaigan, Fujisawa, Kanagawa 2518511, Japan
[3] Shizuoka Univ, Res Inst Elect, Naka Ku, 3-5-1 Johoku, Hamamatsu, Shizuoka 4328561, Japan
Source
NONLINEAR THEORY AND ITS APPLICATIONS, IEICE
Keywords
Recurrent neural network; training algorithm; Nesterov's accelerated quasi-Newton; stochastic method; TensorFlow; optimization
DOI
10.1587/nolta.11.409
Chinese Library Classification
O1 [Mathematics]
Discipline code
0701; 070101
Abstract
Recurrent Neural Networks (RNNs) are powerful sequence models that are particularly difficult to train. This paper proposes an adaptive stochastic Nesterov's accelerated quasi-Newton (aSNAQ) method for training RNNs. Although several algorithms have been proposed for training RNNs, very few use second-order curvature information, despite its ability to improve convergence, because of its high computational complexity. The proposed method is an accelerated second-order method that attempts to incorporate curvature information while maintaining a low per-iteration cost. Furthermore, direction normalization is introduced to address the vanishing and/or exploding gradient problem that is prominent in training RNNs. The performance of the proposed method is evaluated in TensorFlow on benchmark sequence modeling problems. The results show that aSNAQ trains RNNs effectively, with a low per-iteration cost and improved performance compared to the second-order adaQN method and the first-order Adagrad and Adam methods.
Pages: 409-421
Page count: 13
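
The abstract outlines the method's key ingredients: a Nesterov-style lookahead gradient, a limited-memory quasi-Newton approximation of curvature to keep the per-iteration cost low, and direction normalization to bound the update magnitude. The following Python/NumPy sketch illustrates one such accelerated quasi-Newton step under those assumptions; it is a minimal reconstruction, not the paper's exact algorithm, and the function names, hyperparameters (mu, lr, mem), and curvature-pair bookkeeping follow standard L-BFGS practice rather than the published update rules.

import numpy as np

def lbfgs_two_loop(grad, s_hist, y_hist):
    # Limited-memory BFGS two-loop recursion: returns H @ grad without
    # ever forming the inverse-Hessian approximation H explicitly.
    q = grad.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_hist, y_hist)]
    alphas = []
    for s, y, rho in reversed(list(zip(s_hist, y_hist, rhos))):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_hist:
        s, y = s_hist[-1], y_hist[-1]
        q *= (s @ y) / (y @ y)               # standard initial-Hessian scaling
    for (s, y, rho), a in zip(zip(s_hist, y_hist, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q

def accelerated_qn_step(w, v, grad_fn, s_hist, y_hist,
                        mu=0.9, lr=0.1, mem=10, eps=1e-8):
    # One Nesterov-accelerated quasi-Newton step with direction
    # normalization (illustrative, not the paper's exact update rule).
    g = grad_fn(w + mu * v)                  # gradient at the lookahead point
    d = -lbfgs_two_loop(g, s_hist, y_hist)   # quasi-Newton search direction
    d /= np.linalg.norm(d) + eps             # direction normalization
    v = mu * v + lr * d                      # momentum update
    w_new = w + v
    s, y = w_new - w, grad_fn(w_new) - g     # new curvature pair
    if s @ y > eps:                          # keep only well-conditioned pairs
        s_hist.append(s); y_hist.append(y)
        if len(s_hist) > mem:                # bounded memory => low per-iteration cost
            s_hist.pop(0); y_hist.pop(0)
    return w_new, v

# Toy usage: minimize the quadratic f(w) = 0.5 * w' A w.
A = np.diag([1.0, 10.0, 100.0])
grad_fn = lambda w: A @ w
w, v = np.ones(3), np.zeros(3)
s_hist, y_hist = [], []
for k in range(300):
    # Diminishing steps let the normalized-direction iterates settle.
    w, v = accelerated_qn_step(w, v, grad_fn, s_hist, y_hist, lr=0.1 / (1 + 0.1 * k))
print(np.linalg.norm(w))                     # approaches 0 as the step size decays

In this sketch, direction normalization plays the role described in the abstract: the search direction is rescaled to unit length, so the step size stays bounded even when the raw quasi-Newton direction blows up or collapses, which is the failure mode behind exploding and vanishing gradients in RNN training.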