AdaXod: a new adaptive and momental bound algorithm for training deep neural networks

Cited by: 0
Authors
Yuanxuan Liu
Dequan Li
Affiliations
[1] Anhui University of Science and Technology, School of Mathematics and Big Data
[2] Anhui University of Science and Technology, School of Artificial Intelligence
Keywords
Adaptive algorithm; Deep neural network; Image classification; Adaptive and momental bound
DOI: not available
Abstract
Adaptive algorithms are widely used in deep learning because of their fast convergence, and Adam is the most widely used among them. However, studies have shown that Adam's generalization ability is weak. AdaX, a variant of Adam, introduces a novel second-order momentum that modifies Adam's second-order moment estimate and achieves better generalization. However, these algorithms may still fail to converge because of instability and extreme learning rates during training. In this paper, we propose a new adaptive and momental bound algorithm, called AdaXod, which exponentially averages the learning rate and is particularly useful for training deep neural networks. By imposing an adaptive bound on the learning rate in the AdaX algorithm, AdaXod effectively eliminates excessively large learning rates in the later stages of neural network training and thus trains stably. We conduct extensive experiments on different datasets and verify the advantages of AdaXod by comparing it with other state-of-the-art adaptive optimization algorithms. AdaXod eliminates large learning rates during training and outperforms the other optimizers, especially on neural networks with complex structures such as DenseNet.
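To make the abstract's description concrete, the following is a minimal NumPy sketch of what an AdaXod-style update could look like: AdaX's long-term second-moment estimate combined with an AdaMod-style exponential moving average that caps per-parameter learning rates. The function name adaxod_step, the constants (beta2 = 1e-4, beta3 = 0.999), and the use of Adam-style first-moment bias correction are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def adaxod_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=1e-4,
                beta3=0.999, eps=1e-8):
    """One AdaXod-style parameter update (illustrative sketch only)."""
    state["t"] += 1
    t = state["t"]

    # First moment with Adam-style bias correction (assumed here).
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    m_hat = state["m"] / (1 - beta1 ** t)

    # AdaX second moment: the (1 + beta2) factor gives long-term memory
    # of past squared gradients instead of Adam's exponential forgetting.
    state["v"] = (1 + beta2) * state["v"] + beta2 * grad ** 2
    v_hat = state["v"] / ((1 + beta2) ** t - 1)  # AdaX bias correction

    # Raw per-parameter step size.
    eta = lr / (np.sqrt(v_hat) + eps)
    # AdaMod-style momental bound: exponentially average past step sizes
    # and clip the current one, suppressing extreme late-stage learning rates.
    state["s"] = beta3 * state["s"] + (1 - beta3) * eta
    eta = np.minimum(eta, state["s"])

    return theta - eta * m_hat

# Usage: initialize the state once, then call the step per gradient.
theta = np.zeros(3)
state = {"t": 0, "m": np.zeros_like(theta), "v": np.zeros_like(theta),
         "s": np.zeros_like(theta)}
grad = np.array([0.1, -0.2, 0.3])  # stand-in for a backprop gradient
theta = adaxod_step(theta, grad, state)
```

Because the bound s starts at zero, early steps are strongly damped, which acts as an implicit warmup; this mirrors AdaMod's behavior and matches the paper's goal of stabilizing training.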
Pages: 17691-17715
Page count: 24
Related Papers
50 records in total
  • [21] Adaptive algorithm for training pRAM neural networks on unbalanced data sets
    Ramanan, S
    Clarkson, TG
    Taylor, JG
    [J]. ELECTRONICS LETTERS, 1998, 34 (13): 1335-1336
  • [22] Adaptive nonmonotone conjugate gradient training algorithm for recurrent neural networks
    Peng, Chun-Cheng
    Magoulas, George D.
    [J]. 19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL II, PROCEEDINGS, 2007: 374-381
  • [24] An Efficient Learning Algorithm for Direct Training Deep Spiking Neural Networks
    Zhu, Xiaolei
    Zhao, Baixin
    Ma, De
    Tang, Huajin
    [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (03): 847-856
  • [25] Fast-M Adversarial Training Algorithm for Deep Neural Networks
    Ma, Yu
    An, Dou
    Gu, Zhixiang
    Lin, Jie
    Liu, Weiyu
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (11)
  • [26] A New Variant of the GQR Algorithm for Feedforward Neural Networks Training
    Bilski, Jaroslaw
    Kowalczyk, Bartosz
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2021), PT I, 2021, 12854: 41-53
  • [27] A new constructive algorithm for designing and training artificial neural networks
    Sattar, Md. Abdus
    Islam, Md. Monirul
    Murase, Kazuyuki
    [J]. NEURAL INFORMATION PROCESSING, PART I, 2008, 4984: 317+
  • [28] New hierarchical genetic algorithm for training of RBF neural networks
    Zheng, Pi'e
    Ma, Yanhua
    [J]. Kongzhi yu Juece/Control and Decision, 2000, 15 (02): 165-168
  • [29] Exploiting nonlinear dendritic adaptive computation in training deep Spiking Neural Networks
    Shen, Guobin
    Zhao, Dongcheng
    Zeng, Yi
    [J]. NEURAL NETWORKS, 2024, 170: 190-201
  • [30] Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
    Chen, Jinghui
    Zhou, Dongruo
    Tang, Yiqi
    Yang, Ziyan
    Cao, Yuan
    Gu, Quanquan
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020: 3267-3275