Empirical Loss Landscape Analysis of Neural Network Activation Functions

Cited by: 1
Authors
Bosman, Anna Sergeevna [1]
Engelbrecht, Andries [2, 3]
Helbig, Marde [4]
Affiliations
[1] Univ Pretoria, Dept Comp Sci, Pretoria, South Africa
[2] Univ Stellenbosch, Stellenbosch, South Africa
[3] Gulf Univ Sci & Technol, Ctr Appl Math & Bioinformat, Kuwait, Kuwait
[4] Griffith Univ, Sch Informat & Commun Technol, Southport, Qld, Australia
Funding
National Research Foundation, Singapore
Keywords
neural networks; activation functions; loss landscape; fitness landscape analysis
DOI
10.1145/3583133.3596321
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Activation functions play a significant role in neural network design by enabling non-linearity. The choice of activation function was previously shown to influence the properties of the resulting loss landscape. Understanding the relationship between activation functions and loss landscape properties is important for neural architecture and training algorithm design. This study empirically investigates neural network loss landscapes associated with hyperbolic tangent, rectified linear unit, and exponential linear unit activation functions. Rectified linear unit is shown to yield the most convex loss landscape, and exponential linear unit is shown to yield the least flat loss landscape, and to exhibit superior generalisation performance. The presence of wide and narrow valleys in the loss landscape is established for all activation functions, and the narrow valleys are shown to correlate with saturated neurons and implicitly regularised network configurations.
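For reference, the three activation functions named in the abstract can be sketched in NumPy as below. This is a minimal illustration only: the function names and the ELU scale `alpha = 1.0` are assumptions made here, not details taken from the paper.

```python
import numpy as np


def tanh(x):
    """Hyperbolic tangent: bounded in (-1, 1), saturates for large |x|."""
    return np.tanh(x)


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise.

    alpha = 1.0 is an assumed default, not a value from the paper.
    """
    x = np.asarray(x, dtype=float)
    # Clamp to non-positive values before expm1 so large positive inputs
    # cannot overflow in the branch that np.where discards.
    negative_part = alpha * np.expm1(np.minimum(x, 0.0))
    return np.where(x > 0, x, negative_part)
```

Of the three, only ReLU is exactly zero on the whole negative half-line, whereas tanh and ELU approach a constant there (-1 and -alpha respectively).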
Pages: 2029 - 2037
Number of pages: 9
Related papers
50 in total
  • [31] Dynamics analysis and FPGA implementation of discrete memristive cellular neural network with heterogeneous activation functions
    Wang, Chunhua
    Luo, Dingwei
    Deng, Quanli
    Yang, Gang
    CHAOS SOLITONS & FRACTALS, 2024, 187
  • [32] A comparative analysis of various activation functions and optimizers in a convolutional neural network for hyperspectral image classification
    Seyrek E.C.
    Uysal M.
    Multimedia Tools and Applications, 2024, 83 (18) : 53785 - 53816
  • [33] GPU-based Empirical Evaluation of Activation Functions in Convolutional Neural Networks
    Zaheer, Raniah
    Shaziya, Humera
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INVENTIVE SYSTEMS AND CONTROL (ICISC 2018), 2018, : 769 - 773
  • [34] Neural Approximation of Empirical Functions
    Roj, J.
    ACTA PHYSICA POLONICA A, 2013, 124 (03) : 554 - 557
  • [35] Impact Analysis of Different Effective Loss Functions by Using Deep Convolutional Neural Network for Face Recognition
    Nguyen, Anh D.
    Nguyen, Dat T.
    Dao, Hai N.
    Le, Hai H.
    Tran, Nam Q.
    FROM BORN-PHYSICAL TO BORN-VIRTUAL: AUGMENTING INTELLIGENCE IN DIGITAL LIBRARIES, ICADL 2022, 2022, 13636 : 101 - 111
  • [36] A NEURAL-NETWORK TECHNIQUE OF GENERATING EMPIRICAL BIVARIATE DISTRIBUTION-FUNCTIONS
    WANG, SH
    NEURAL PROCESSING LETTERS, 1995, 2 (05) : 14 - 18
  • [37] Complex backpropagation neural network using elementary transcendental activation functions
    Kim, T
    Adali, T
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 1281 - 1284
  • [38] Rule Extraction from Artificial Neural Network with Optimized Activation Functions
    Wang Jian-guo
    Yang Jian-hong
    Zhang Wen-xing
    Xu Jin-wu
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 873 - +
  • [39] Multimodal transistors as ReLU activation functions in physical neural network classifiers
    Isin Surekcigil Pesch
    Eva Bestelink
    Olivier de Sagazan
    Adnan Mehonic
    Radu A. Sporea
    Scientific Reports, 12
  • [40] Determination of Activation Functions in A Feedforward Neural Network by using Genetic Algorithm
    Ustun, Oguz
    PAMUKKALE UNIVERSITY JOURNAL OF ENGINEERING SCIENCES-PAMUKKALE UNIVERSITESI MUHENDISLIK BILIMLERI DERGISI, 2009, 15 (03) : 395 - 403