AdaXod: a new adaptive and momental bound algorithm for training deep neural networks

Cited by: 0
Authors
Yuanxuan Liu
Dequan Li
Affiliations
[1] Anhui University of Science and Technology, School of Mathematics and Big Data
[2] Anhui University of Science and Technology, School of Artificial Intelligence
Source
The Journal of Supercomputing, 2023, 79(15)
Keywords
Adaptive algorithm; Deep neural network; Image classification; Adaptive and momental bound
DOI
Not available
Abstract
Adaptive algorithms are widely used in deep learning because of their fast convergence, and Adam is the most popular among them. However, studies have shown that Adam generalizes poorly. AdaX, a variant of Adam, introduces a novel second-order momentum that modifies Adam's second moment and achieves better generalization. Nevertheless, these algorithms may still fail to converge because of instability and extreme learning rates during training. In this paper, we propose a new adaptive and momental bound algorithm, called AdaXod, which exponentially averages the adaptive learning rate and is particularly useful for training deep neural networks. By adaptively bounding the learning rate in the AdaX algorithm, AdaXod effectively eliminates excessively large learning rates in the later stages of training and thus keeps training stable. We conduct extensive experiments on different datasets and verify the advantages of AdaXod by comparison with other state-of-the-art adaptive optimization algorithms. AdaXod eliminates large learning rates during training and outperforms the other optimizers, especially on neural networks with complex structures such as DenseNet.
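The abstract describes AdaXod as AdaX combined with an exponentially averaged bound on the adaptive learning rate (the "adaptive and momental bound" of AdaMod). The sketch below is a hypothetical scalar implementation under that reading: the second-moment recursion v_t = (1 + β2)·v_{t-1} + β2·g_t² and its bias correction follow the AdaX paper, and the step-size clipping follows AdaMod; the function name, hyperparameter defaults, and the exact way the two pieces are combined are assumptions, not the authors' code.

```python
import math

def adaxod_step(theta, grad, state, lr=1e-3, beta1=0.9,
                beta2=1e-4, beta3=0.999, eps=1e-12):
    """One hypothetical AdaXod step on a scalar parameter (a sketch,
    not the authors' implementation)."""
    state["t"] += 1
    # First moment: exponential moving average of gradients, as in Adam/AdaX.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # Second moment: AdaX's long-term accumulating variant (grows with t).
    state["v"] = (1 + beta2) * state["v"] + beta2 * grad * grad
    v_hat = state["v"] / ((1 + beta2) ** state["t"] - 1)  # AdaX bias correction
    eta = lr / (math.sqrt(v_hat) + eps)  # raw adaptive step size
    # AdaMod-style momental bound: an exponential average of past step
    # sizes caps the current one, suppressing extreme learning rates.
    state["s"] = beta3 * state["s"] + (1 - beta3) * eta
    eta = min(eta, state["s"])
    return theta - eta * state["m"]

# Usage: minimize f(theta) = theta^2 starting from theta = 1.
state = {"t": 0, "m": 0.0, "v": 0.0, "s": 0.0}
theta = 1.0
for _ in range(2000):
    theta = adaxod_step(theta, 2.0 * theta, state)  # grad of theta^2 is 2*theta
```

Because `s` starts at zero, early steps are tiny and the effective learning rate ramps up smoothly, which is one plausible reading of how the bound stabilizes the later training stages the abstract refers to.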
Pages: 17691-17715
Page count: 24
Related papers
50 in total
  • [1] AdaXod: a new adaptive and momental bound algorithm for training deep neural networks
    Liu, Yuanxuan
    Li, Dequan
    [J]. JOURNAL OF SUPERCOMPUTING, 2023, 79(15): 17691-17715
  • [2] A fast adaptive algorithm for training deep neural networks
    Gui, Yangting
    Li, Dequan
    Fang, Runyue
    [J]. APPLIED INTELLIGENCE, 2023, 53(04): 4099-4108
  • [4] An Adaptive Layer Expansion Algorithm for Efficient Training of Deep Neural Networks
    Chen, Yi-Long
    Liu, Pangfeng
    Wu, Jan-Jan
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 420 - 425
  • [5] AN ADAPTIVE TRAINING ALGORITHM FOR BACKPROPAGATION NEURAL NETWORKS
    HSIN, HC
    LI, CC
    SUN, MG
    SCLABASSI, RJ
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1995, 25(03): 512-514
  • [6] SPEAKER ADAPTIVE TRAINING USING DEEP NEURAL NETWORKS
    Ochiai, Tsubasa
    Matsuda, Shigeki
    Lu, Xugang
    Hori, Chiori
    Katagiri, Shigeru
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [7] IMPROVEMENTS TO SPEAKER ADAPTIVE TRAINING OF DEEP NEURAL NETWORKS
    Miao, Yajie
    Jiang, Lu
    Zhang, Hao
    Metze, Florian
    [J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 165 - 170
  • [8] The adaptive fuzzy training algorithm for feedforward neural networks
    Xie, P.
    Liu, B.
    [J]. Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2001, 23(07): 79-82
  • [9] Hybrid pre-training algorithm of Deep Neural Networks
    Drokin, I. S.
    [J]. 6TH SEMINAR ON INDUSTRIAL CONTROL SYSTEMS: ANALYSIS, MODELING AND COMPUTATION, 2016, 6
  • [10] An efficient bandwidth-adaptive gradient compression algorithm for distributed training of deep neural networks
    Wang, Zeqin
    Duan, Qingyang
    Xu, Yuedong
    Zhang, Liang
    [J]. JOURNAL OF SYSTEMS ARCHITECTURE, 2024, 150