Parametric RSigELU: a new trainable activation function for deep learning

Cited: 0
Authors
Kilicarslan, Serhat [1 ]
Celik, Mete [2 ]
Affiliations
[1] Bandirma Onyedi Eylul Univ, Dept Software Engn, TR-10200 Balikesir, Turkiye
[2] Erciyes Univ, Fac Engn, Dept Comp Engn, TR-38039 Kayseri, Turkiye
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Iss. 13
Keywords
Deep learning; Parametric activation function (P plus RSigELU); MNIST; CIFAR-10; CIFAR-100; Trainable activation function;
DOI
10.1007/s00521-024-09538-9
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Activation functions help deep learning models extract meaningful relationships from real-world problems. Their development is therefore of great interest to researchers, since the choice of activation function directly affects model performance. In the literature, nonlinear activation functions are generally preferred because linear activation functions limit the learning capacity of deep learning models. Nonlinear activation functions can be classified as fixed-parameter or trainable, depending on whether the function's parameters are fixed (i.e., user-given) or updated during training. The parameters of fixed-parameter activation functions must be specified before training; determining appropriate values is time-consuming and can slow the model's convergence. In contrast, trainable activation functions, whose parameters are updated at each iteration of training, converge faster and better by finding the parameter values best suited to the dataset and architecture. This study proposes the parametric RSigELU (P+RSigELU) trainable activation functions, namely P+RSigELU Single (P+RSigELUS) and P+RSigELU Double (P+RSigELUD), to improve on the fixed-parameter RSigELU activation function. The proposed functions were evaluated on the benchmark MNIST, CIFAR-10, and CIFAR-100 datasets. Results show that they outperform the PReLU, PELU, ALISA, P+FELU, PSigmoid, and GELU activation functions from the literature. The code is available at https://github.com/serhatklc/P-RsigELU-Activation-Function.
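The abstract does not give the closed form of the proposed functions. As a minimal illustrative sketch (not the paper's implementation), the snippet below assumes the piecewise single-parameter RSigELU definition from the authors' earlier fixed-parameter work, and shows the forward pass together with the partial derivative with respect to the parameter alpha, which is the quantity a trainable (P+RSigELUS) variant would update by backpropagation. The function names and the alpha = 0.5 default are illustrative assumptions.

```python
import numpy as np

def rsigelu_s(x, alpha=0.5):
    """Assumed single-parameter RSigELU forward pass (piecewise):
      x > 1      : x * sigmoid(x) * alpha + x   (sigmoid-weighted positive region)
      0 <= x <= 1: x                            (identity, ReLU-like region)
      x < 0      : alpha * (exp(x) - 1)         (ELU-like negative region)
    In the parametric variant, alpha is a trainable parameter rather
    than a user-given constant.
    """
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x > 1, x * sig * alpha + x,
                    np.where(x >= 0, x, alpha * np.expm1(x)))

def d_rsigelu_s_dalpha(x):
    """Partial derivative of the assumed RSigELUS w.r.t. alpha.
    A trainable variant would use this in the chain rule to update
    alpha at each training iteration."""
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x > 1, x * sig, np.where(x >= 0, 0.0, np.expm1(x)))
```

In a framework such as PyTorch, the same idea would be realized by registering alpha as a learnable parameter so the optimizer updates it alongside the network weights, which is what distinguishes P+RSigELU from its fixed-parameter predecessor.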
Pages: 7595-7607
Page count: 13