Impact of Mathematical Norms on Convergence of Gradient Descent Algorithms for Deep Neural Networks Learning

Cited by: 1
Authors
Cai, Linzhe [1 ]
Yu, Xinghuo [1 ]
Li, Chaojie [2 ]
Eberhard, Andrew [1 ]
Lien Thuy Nguyen [1 ]
Chuong Thai Doan [1 ]
Affiliations
[1] RMIT Univ, Sch Engn, Melbourne, Vic 3000, Australia
[2] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Funding
Australian Research Council
Keywords
Infinity norm; Finite-time convergence; Norms equivalence; Deep neural network; FLOWS;
DOI
10.1007/978-3-031-22695-3_10
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
To improve the performance of gradient descent learning algorithms, the impact of different types of norms on deep neural network training is studied. The performance of different norm types applied to both finite-time and fixed-time convergence algorithms is compared. The accuracy achieved on a multi-class classification task by three typical algorithms using different norms is reported, and the improvement of Jorge's finite-time algorithm with momentum or Nesterov accelerated gradient is also studied. Numerical experiments show that the infinity norm can provide better performance in finite-time gradient descent algorithms and exhibits strong robustness across different network structures.
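As a minimal sketch of the mechanism the abstract describes (not the paper's own algorithm), the snippet below shows a norm-normalized gradient descent step in Python, where the raw gradient is rescaled by a selectable norm; the function name and parameters (norm_scaled_gd_step, lr, norm_ord) are illustrative assumptions, not names from the paper.

    import numpy as np

    def norm_scaled_gd_step(w, grad, lr=0.2, norm_ord=np.inf):
        """One norm-normalized gradient descent step.

        The raw gradient is divided by its norm (selected via
        norm_ord), so the step length is governed by the chosen
        norm rather than by the raw gradient magnitude -- the
        mechanism behind finite-time normalized-gradient schemes.
        """
        g_norm = np.linalg.norm(grad, ord=norm_ord)
        if g_norm == 0.0:
            return w  # stationary point: no update
        return w - lr * grad / g_norm

    # Toy quadratic f(w) = 0.5 * ||w||_2^2, whose gradient is w itself.
    w = np.array([3.0, -4.0])
    for _ in range(100):
        w = norm_scaled_gd_step(w, grad=w, lr=0.2, norm_ord=np.inf)
    print(w)  # settles within a band of width ~lr around the minimizer

Swapping norm_ord between 2 and np.inf changes whether the step scales with the gradient's Euclidean length or its largest component, which is the kind of norm comparison the paper's experiments carry out across algorithms.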
Pages: 131-144
Page count: 14