On the Impact of the Activation Function on Deep Neural Networks Training

Citations: 0
Authors
Hayou, Soufiane [1]
Doucet, Arnaud [1]
Rousseau, Judith [1]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate choice can lead to the loss of input information during forward propagation and to exponentially vanishing or exploding gradients during back-propagation. Understanding the theoretical properties of untrained random networks is key to identifying which deep networks may be trained successfully, as recently demonstrated by Schoenholz et al. (2017), who showed that for deep feedforward neural networks only a specific choice of hyperparameters, known as the 'Edge of Chaos', can lead to good performance. While Schoenholz et al. (2017) discuss trainability issues, we focus here on training acceleration and overall performance. We give a comprehensive theoretical analysis of the Edge of Chaos and show that one can indeed tune the initialization parameters and the activation function so as to accelerate training and improve performance.
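To illustrate the phenomenon the abstract describes, here is a minimal NumPy sketch (illustrative only, not the authors' code; the function name and parameter values are assumptions) that propagates a random input through a deep random tanh network and tracks the pre-activation variance per layer. For tanh with zero bias, initialization variances below the Edge of Chaos (here sigma_w < 1) drive the signal to zero with depth (ordered phase), while larger variances keep it at a non-trivial fixed point:

```python
import numpy as np

def depth_variance(sigma_w, sigma_b, depth=50, width=512, seed=0):
    """Propagate a random input through a deep random tanh network and
    return the pre-activation variance observed at each layer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)  # random input vector
    variances = []
    for _ in range(depth):
        # Standard i.i.d. Gaussian initialization: Var(W_ij) = sigma_w^2 / width
        W = rng.standard_normal((width, width)) * (sigma_w / np.sqrt(width))
        b = rng.standard_normal(width) * sigma_b
        x = W @ np.tanh(x) + b  # next layer's pre-activations
        variances.append(x.var())
    return variances

# Ordered phase (signal dies), near the Edge of Chaos for tanh
# (sigma_w = 1, sigma_b = 0), and beyond it (variance settles at a
# non-zero fixed point).
for sw in (0.5, 1.0, 2.0):
    v = depth_variance(sw, 0.0)
    print(f"sigma_w={sw}: layer-{len(v)} pre-activation variance = {v[-1]:.3g}")
```

In this toy experiment the depth-50 variance collapses to essentially zero for sigma_w = 0.5 but remains of order one for sigma_w = 2, matching the forward-propagation information-loss argument above; the paper's analysis characterizes the boundary between these regimes precisely.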
Pages: 9
Related papers
50 items in total
  • [41] A Novel Activation Function of Deep Neural Network
    Xiangyang, Lin
    Xing, Qinghua
    Han, Zhang
    Feng, Chen
    [J]. Scientific Programming, 2023, 2023
  • [42] Leveraging Product as an Activation Function in Deep Networks
    Godfrey, Luke B.
    Gashler, Michael S.
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018, : 1617 - 1622
  • [43] Extraction of the Sivers function with deep neural networks
    Fernando, I. P.
    Keller, D.
    [J]. PHYSICAL REVIEW D, 2023, 108 (05)
  • [44] Multistability of neural networks with discontinuous activation function
    Huang, Gan
    Cao, Jinde
    [J]. COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 2008, 13 (10) : 2279 - 2289
  • [45] Adaptive Morphing Activation Function for Neural Networks
    Herrera-Alcantara, Oscar
    Arellano-Balderas, Salvador
    [J]. FRACTAL AND FRACTIONAL, 2024, 8 (08)
  • [46] Activation function of wavelet chaotic neural networks
    Xu, Yao-Qun
    Sun, Ming
    Guo, Meng-Shu
    [J]. PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 716 - 721
  • [47] Neural networks with adaptive spline activation function
    Campolucci, P
    Capparelli, F
    Guarnieri, S
    Piazza, F
    Uncini, A
    [J]. MELECON '96 - 8TH MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, PROCEEDINGS, VOLS I-III: INDUSTRIAL APPLICATIONS IN POWER SYSTEMS, COMPUTER SCIENCE AND TELECOMMUNICATIONS, 1996, : 1442 - 1445
  • [48] DYNAMICS OF NEURAL NETWORKS WITH NONMONOTONE ACTIVATION FUNCTION
    DEFELICE, P
    MARANGI, C
    NARDULLI, G
    PASQUARIELLO, G
    TEDESCO, L
    [J]. NETWORK-COMPUTATION IN NEURAL SYSTEMS, 1993, 4 (01) : 1 - 9
  • [49] Activation function of transiently chaotic neural networks
    Xu, Yaoqun
    Sun, Ming
    Duan, Guangren
    [J]. WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 3004 - +
  • [50] Regularizing Activation Distribution for Training Binarized Deep Networks
    Ding, Ruizhou
    Chin, Ting-Wu
    Liu, Zeye
    Marculescu, Diana
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 11400 - 11409