Mathematical Analysis and Performance Evaluation of the GELU Activation Function in Deep Learning

Cited: 13
Authors: Lee, Minhyeok [1]
Institutions: [1] Chung Ang Univ, Sch Elect & Elect Engn, Seoul 06974, South Korea
Funding: National Research Foundation of Singapore
DOI: 10.1155/2023/4229924
Chinese Library Classification: O1 [Mathematics]
Discipline codes: 0701; 070101
Abstract
Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian error linear unit (GELU) activation function has emerged as a dominant method, surpassing traditional functions such as the rectified linear unit (ReLU) in various applications. This study presents a rigorous mathematical investigation of the GELU activation function, exploring its differentiability, boundedness, stationarity, and smoothness properties in detail. In addition, we conduct an extensive experimental comparison of the GELU function against a broad range of alternative activation functions, utilizing a residual convolutional network trained on the CIFAR-10, CIFAR-100, and STL-10 datasets as the empirical testbed. Our results demonstrate the superior performance of GELU compared to other activation functions, establishing its suitability for a wide range of deep learning applications. This comprehensive study contributes to a more profound understanding of the underlying mathematical properties of GELU and provides valuable insights for practitioners aiming to select activation functions that optimally align with their specific objectives and constraints in deep learning.
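As background for the abstract's comparison of GELU with ReLU, the standard definition of GELU is x * Phi(x), where Phi is the standard normal CDF, alongside the tanh-based approximation commonly used in deep learning frameworks. The sketch below is illustrative and is not taken from the paper itself; the function names are my own.

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Widely used tanh approximation of GELU."""
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def relu(x: float) -> float:
    """ReLU, for comparison: hard gating at zero."""
    return max(0.0, x)
```

Unlike ReLU, which zeroes all negative inputs, GELU weights inputs by the probability that a standard normal variable falls below them, giving a smooth, everywhere-differentiable curve that is slightly negative for small negative inputs.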
Pages: 13