On the Impact of the Activation Function on Deep Neural Networks Training

Cited by: 0
Authors
Hayou, Soufiane [1 ]
Doucet, Arnaud [1 ]
Rousseau, Judith [1 ]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate choice can lead to the loss of input information during forward propagation and to exponentially vanishing or exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks can be trained successfully, as recently demonstrated by Schoenholz et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While Schoenholz et al. (2017) address trainability, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that the initialization parameters and the activation function can indeed be tuned to accelerate training and improve performance.
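The 'Edge of Chaos' phenomenon described in the abstract can be illustrated numerically. The following is a minimal, illustrative Python sketch, not code from the paper: it propagates two moderately correlated inputs through deep random tanh networks and reports their correlation at depth. The choice (sigma_w, sigma_b) = (1, 0) as the Edge of Chaos point for tanh is a known result from this line of work; the width, depth, and phase settings used here are illustrative assumptions.

import numpy as np

def correlation_at_depth(sigma_w, sigma_b, width=500, depth=100, seed=0):
    """Propagate two correlated inputs through a random tanh network and
    return their empirical correlation after `depth` layers."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(width)
    # Second input with correlation ~0.5 to the first one.
    x2 = 0.5 * x1 + np.sqrt(0.75) * rng.standard_normal(width)
    h1, h2 = x1, x2
    for _ in range(depth):
        # i.i.d. Gaussian initialization: W_ij ~ N(0, sigma_w^2/width), b_i ~ N(0, sigma_b^2)
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        b = rng.standard_normal(width) * sigma_b
        h1, h2 = np.tanh(W @ h1 + b), np.tanh(W @ h2 + b)
    return np.corrcoef(h1, h2)[0, 1]

# Ordered phase: correlations converge exponentially fast to 1 (distinct inputs collapse).
# Chaotic phase: correlations converge to a fixed point strictly below 1 (inputs decorrelate).
# Edge of Chaos ((sigma_w, sigma_b) = (1, 0) for tanh): correlations evolve only
# sub-exponentially, so input geometry survives much deeper propagation.
settings = {"ordered":       (0.5, 0.3),
            "edge of chaos": (1.0, 0.0),
            "chaotic":       (3.0, 0.3)}
for name, (sw, sb) in settings.items():
    print(f"{name:14s} correlation after 100 layers: {correlation_at_depth(sw, sb):+.3f}")

Running this sketch shows the qualitative behaviour the abstract refers to: away from the Edge of Chaos, information about the input geometry is lost exponentially fast in depth, whereas on the Edge of Chaos it degrades only slowly.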
Pages: 9
Related Papers
50 records in total
  • [21] A parameterized activation function for learning fuzzy logic operations in deep neural networks
    Godfrey, Luke B.
    Gashler, Michael S.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 740 - 745
  • [22] Periodic Function as Activation Function for Neural Networks
    Xu, Ding
    Guan, Yue
    Cai, Ping-ping
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE: TECHNIQUES AND APPLICATIONS, AITA 2016, 2016, : 179 - 183
  • [23] Is normalization indispensable for training deep neural networks?
    Shao, Jie
    Hu, Kai
    Wang, Changhu
    Xue, Xiangyang
    Raj, Bhiksha
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [24] Exploiting Invariance in Training Deep Neural Networks
    Ye, Chengxi
    Zhou, Xiong
    McKinney, Tristan
    Liu, Yanfeng
    Zhou, Qinggang
    Zhdanov, Fedor
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8849 - 8856
  • [25] On Calibration of Mixup Training for Deep Neural Networks
    Maronas, Juan
    Ramos, Daniel
    Paredes, Roberto
    [J]. STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 67 - 76
  • [26] Training Deep Neural Networks with Gradual Deconvexification
    Lo, James Ting-Ho
    Gui, Yichuan
    Peng, Yun
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1000 - 1007
  • [27] Training Deep Neural Networks for Visual Servoing
    Bateux, Quentin
    Marchand, Eric
    Leitner, Jurgen
    Chaumette, Francois
    Corke, Peter
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 3307 - 3314
  • [28] Local Critic Training of Deep Neural Networks
    Lee, Hojung
    Lee, Jong-Seok
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [29] An Optimization Strategy for Deep Neural Networks Training
    Wu, Tingting
    Zeng, Peng
    Song, Chunhe
    [J]. 2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 596 - 603
  • [30] Exploring Strategies for Training Deep Neural Networks
    Larochelle, Hugo
    Bengio, Yoshua
    Louradour, Jerome
    Lamblin, Pascal
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1 - 40