On the Impact of the Activation Function on Deep Neural Networks Training

Cited by: 0
Authors: Hayou, Soufiane [1]; Doucet, Arnaud [1]; Rousseau, Judith [1]
Affiliations: [1] Univ Oxford, Dept Stat, Oxford, England
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial intelligence theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate choice can lead to the loss of information about the input during forward propagation and to exponentially vanishing or exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks can be trained successfully, as recently demonstrated by Schoenholz et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While Schoenholz et al. (2017) address trainability, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that one can indeed tune the initialization parameters and the activation function to accelerate training and improve performance.
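To make the Edge of Chaos criterion mentioned in the abstract concrete, below is a minimal numerical sketch (not taken from the paper) of the mean-field recursions it refers to, assuming weights drawn as W ~ N(0, sigma_w^2 / fan_in) and biases as b ~ N(0, sigma_b^2) at initialization; the function names are illustrative only.

```python
# Minimal sketch of the mean-field recursions behind the 'Edge of Chaos'
# (following Schoenholz et al., 2017), for a tanh network.
import numpy as np

def variance_fixed_point(sigma_b, sigma_w, phi=np.tanh, iters=200, n_mc=100_000):
    """Iterate q <- sigma_b^2 + sigma_w^2 * E[phi(sqrt(q) Z)^2] to its fixed point q*."""
    z = np.random.randn(n_mc)          # Monte Carlo samples of Z ~ N(0, 1)
    q = 1.0
    for _ in range(iters):
        q = sigma_b**2 + sigma_w**2 * np.mean(phi(np.sqrt(q) * z) ** 2)
    return q

def chi(sigma_b, sigma_w, phi=np.tanh, dphi=lambda x: 1 - np.tanh(x) ** 2, n_mc=100_000):
    """Gradient propagation factor chi = sigma_w^2 * E[phi'(sqrt(q*) Z)^2]."""
    q_star = variance_fixed_point(sigma_b, sigma_w, phi)
    z = np.random.randn(n_mc)
    return sigma_w**2 * np.mean(dphi(np.sqrt(q_star) * z) ** 2)

# Example: scan sigma_w at fixed sigma_b to see where chi crosses 1.
for sw in (0.5, 1.0, 1.5, 2.0):
    print(f"sigma_b=0.1, sigma_w={sw}: chi ~ {chi(0.1, sw):.3f}")
```

In this sketch, chi < 1 corresponds to the ordered phase (vanishing gradients), chi > 1 to the chaotic phase (exploding gradients), and the curve chi = 1 in the (sigma_b, sigma_w) plane is the Edge of Chaos on which the initialization parameters are tuned.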
Pages: 9
Related papers (50 total):
  • [1] The Impact of Architecture on the Deep Neural Networks Training
    Rozycki, Pawel
    Kolbusz, Janusz
    Malinowski, Aleksander
    Wilamowski, Bogdan
2019 12TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI), 2019: 41-46
  • [2] RSigELU: A nonlinear activation function for deep neural networks
    Kilicarslan, Serhat
    Celik, Mete
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 174 (174)
  • [3] The impact of activation functions on training and performance of a deep neural network
    Marcu, David C.
    Grava, Cristian
2021 16TH INTERNATIONAL CONFERENCE ON ENGINEERING OF MODERN ELECTRIC SYSTEMS (EMES), 2021: 126-129
  • [4] An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks
    Chai, Enhui
    Yu, Wei
    Cui, Tianxiang
    Ren, Jianfeng
    Ding, Shusheng
SYMMETRY-BASEL, 2022, 14 (05)
  • [5] Regularized Flexible Activation Function Combination for Deep Neural Networks
    Jie, Renlong
    Gao, Junbin
    Vasnev, Andrey
    Tran, Minh-ngoc
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 2001-2008
  • [6] NIPUNA: A Novel Optimizer Activation Function for Deep Neural Networks
    Madhu, Golla
    Kautish, Sandeep
    Alnowibet, Khalid Abdulaziz
    Zawbaa, Hossam M. M.
    Mohamed, Ali Wagdy
    AXIOMS, 2023, 12 (03)
  • [7] Serf: Towards better training of deep neural networks using log-Softplus ERror activation Function
    Nag, Sayan
    Bhattacharyya, Mayukh
    Mukherjee, Anuraag
    Kundu, Rohit
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023: 5313-5322
  • [8] NONPARAMETRIC REGRESSION USING DEEP NEURAL NETWORKS WITH RELU ACTIVATION FUNCTION
    Schmidt-Hieber, Johannes
ANNALS OF STATISTICS, 2020, 48 (04): 1875-1897
  • [9] Approximating smooth functions by deep neural networks with sigmoid activation function
    Langer, Sophie
    JOURNAL OF MULTIVARIATE ANALYSIS, 2021, 182
  • [10] Smooth Function Approximation by Deep Neural Networks with General Activation Functions
    Ohn, Ilsang
    Kim, Yongdai
    ENTROPY, 2019, 21 (07)