Impact of Mathematical Norms on Convergence of Gradient Descent Algorithms for Deep Neural Networks Learning

Cited by: 1
Authors
Cai, Linzhe [1 ]
Yu, Xinghuo [1 ]
Li, Chaojie [2 ]
Eberhard, Andrew [1 ]
Lien Thuy Nguyen [1 ]
Chuong Thai Doan [1 ]
Affiliations
[1] RMIT Univ, Sch Engn, Melbourne, Vic 3000, Australia
[2] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Funding
Australian Research Council
Keywords
Infinity norm; Finite-time convergence; Norms equivalence; Deep neural network; FLOWS;
DOI
10.1007/978-3-031-22695-3_10
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
To improve the performance of gradient descent learning algorithms, the impact of different types of norms on deep neural network training is studied. The performance of different norm types applied to both finite-time and fixed-time convergence algorithms is compared. The accuracy achieved on a multi-class classification task by three typical algorithms using different norms is reported, and the improvement of Jorge's finite-time algorithm with momentum or Nesterov accelerated gradient is also studied. Numerical experiments show that the infinity norm can provide better performance in finite-time gradient descent algorithms and exhibits strong robustness across different network structures.
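As a minimal sketch of the mechanism the abstract describes (not the paper's own algorithm), the snippet below shows a norm-normalized gradient descent step in Python, where the raw gradient is rescaled by a selectable norm; the function name and parameters (norm_scaled_gd_step, lr, norm_ord) are illustrative assumptions, not names from the paper.

    import numpy as np

    def norm_scaled_gd_step(w, grad, lr=0.2, norm_ord=np.inf):
        """One norm-normalized gradient descent step.

        The raw gradient is divided by its norm (selected via
        norm_ord), so the step length is governed by the chosen
        norm rather than by the raw gradient magnitude -- the
        mechanism behind finite-time normalized-gradient schemes.
        """
        g_norm = np.linalg.norm(grad, ord=norm_ord)
        if g_norm == 0.0:
            return w  # stationary point: no update
        return w - lr * grad / g_norm

    # Toy quadratic f(w) = 0.5 * ||w||_2^2, whose gradient is w itself.
    w = np.array([3.0, -4.0])
    for _ in range(100):
        w = norm_scaled_gd_step(w, grad=w, lr=0.2, norm_ord=np.inf)
    print(w)  # settles within a band of width ~lr around the minimizer

Swapping norm_ord between 2 and np.inf changes whether the step scales with the gradient's Euclidean length or its largest component, which is the kind of norm comparison the paper's experiments carry out across algorithms.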
Pages: 131-144
Page count: 14