On the Impact of the Activation Function on Deep Neural Networks Training

Cited by: 0
Authors:
Hayou, Soufiane [1]
Doucet, Arnaud [1]
Rousseau, Judith [1]
Affiliations:
[1] Univ Oxford, Dept Stat, Oxford, England
Keywords:
DOI: Not available
CLC number (Chinese Library Classification): TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate choice can lead to the loss of input information during forward propagation and to exponentially vanishing or exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks may be trained successfully, as recently demonstrated by Schoenholz et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While Schoenholz et al. (2017) discuss trainability issues, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that we can indeed tune the initialization parameters and the activation function in order to accelerate training and improve performance.
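As a toy illustration of the phenomenon described in the abstract, the NumPy sketch below (not the authors' code; the variance settings are assumptions chosen only for illustration) propagates two distinct inputs through the same randomly initialized deep tanh network and tracks their cosine similarity layer by layer. In an "ordered" regime the two inputs collapse onto each other, so input information is lost during forward propagation; in a "chaotic" regime they are driven apart; initializations near the Edge of Chaos sit between the two.

import numpy as np

def correlation_depth_profile(depth=50, width=512, sigma_w=1.0, sigma_b=0.05, seed=0):
    """Propagate two distinct inputs through the SAME random tanh layers and
    return their cosine similarity after each layer."""
    rng = np.random.default_rng(seed)
    h1 = rng.normal(size=width)
    h2 = rng.normal(size=width)
    sims = []
    for _ in range(depth):
        # Weights ~ N(0, sigma_w^2 / width), biases ~ N(0, sigma_b^2), as in standard
        # mean-field analyses of random feedforward networks.
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b, size=width)
        h1 = np.tanh(W @ h1 + b)
        h2 = np.tanh(W @ h2 + b)
        sims.append(float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12)))
    return sims

# Illustrative regimes for a tanh network with small bias variance (assumed values):
#   sigma_w = 0.8 -> ordered phase: distinct inputs become nearly identical with depth
#   sigma_w = 2.0 -> chaotic phase: similarity decays, nearby inputs are pulled apart
#   sigma_w ~ 1.0 -> near the Edge of Chaos, where deep signal propagation is best behaved
for sw in (0.8, 1.0, 2.0):
    sims = correlation_depth_profile(sigma_w=sw)
    print(f"sigma_w={sw}: cosine similarity at depth 50 = {sims[-1]:.3f}")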
Pages: 9
Related articles
50 results in total
  • [1] The Impact of Architecture on the Deep Neural Networks Training
    Rozycki, Pawel
    Kolbusz, Janusz
    Malinowski, Aleksander
    Wilamowski, Bogdan
    2019 12TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI), 2019: 41-46
  • [2] RSigELU: A nonlinear activation function for deep neural networks
    Kilicarslan, Serhat
    Celik, Mete
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 174
  • [3] The impact of activation functions on training and performance of a deep neural network
    Marcu, David C.
    Grava, Cristian
    2021 16TH INTERNATIONAL CONFERENCE ON ENGINEERING OF MODERN ELECTRIC SYSTEMS (EMES), 2021: 126-129
  • [4] An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks
    Chai, Enhui
    Yu, Wei
    Cui, Tianxiang
    Ren, Jianfeng
    Ding, Shusheng
    SYMMETRY-BASEL, 2022, 14 (05)
  • [5] Regularized Flexible Activation Function Combination for Deep Neural Networks
    Jie, Renlong
    Gao, Junbin
    Vasnev, Andrey
    Tran, Minh-ngoc
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 2001-2008
  • [6] NIPUNA: A Novel Optimizer Activation Function for Deep Neural Networks
    Madhu, Golla
    Kautish, Sandeep
    Alnowibet, Khalid Abdulaziz
    Zawbaa, Hossam M. M.
    Mohamed, Ali Wagdy
    AXIOMS, 2023, 12 (03)
  • [7] Serf: Towards better training of deep neural networks using log-Softplus ERror activation Function
    Nag, Sayan
    Bhattacharyya, Mayukh
    Mukherjee, Anuraag
    Kundu, Rohit
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023: 5313-5322
  • [8] Approximating smooth functions by deep neural networks with sigmoid activation function
    Langer, Sophie
    JOURNAL OF MULTIVARIATE ANALYSIS, 2021, 182
  • [9] Smooth Function Approximation by Deep Neural Networks with General Activation Functions
    Ohn, Ilsang
    Kim, Yongdai
    ENTROPY, 2019, 21 (07)
  • [10] Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks
    Rhu, Minsoo
    O'Connor, Mike
    Chatterjee, Niladrish
    Pool, Jeff
    Kwon, Youngeun
    Keckler, Stephen W.
    2018 24TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2018: 78-91