Parametric RSigELU: a new trainable activation function for deep learning

Cited by: 0
Authors
Kilicarslan, Serhat [1 ]
Celik, Mete [2 ]
Affiliations
[1] Bandirma Onyedi Eylul Univ, Dept Software Engn, TR-10200 Balikesir, Turkiye
[2] Erciyes Univ, Fac Engn, Dept Comp Engn, TR-38039 Kayseri, Turkiye
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 13
Keywords
Deep learning; Parametric activation function (P+RSigELU); MNIST; CIFAR-10; CIFAR-100; Trainable activation function
DOI
10.1007/s00521-024-09538-9
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Activation functions are used to extract meaningful relationships from real-world problems with the help of deep learning models. Thus, the development of activation functions, which affect the performance of deep learning models, is of great interest to researchers. In the literature, nonlinear activation functions are generally preferred, since linear activation functions limit the learning capacity of deep learning models. Nonlinear activation functions can be classified as fixed-parameter or trainable, depending on whether the function parameter is fixed (i.e., user-given) or modified during the training of the deep learning model. The parameters of fixed-parameter activation functions must be specified before training the deep learning model; determining appropriate parameter values is time-consuming and can cause slow convergence. In contrast, trainable activation functions, whose parameters are updated in each iteration of the training process, achieve faster and better convergence by finding the most suitable parameter values for the given datasets and deep learning architectures. This study proposes parametric RSigELU (P+RSigELU) trainable activation functions, namely P+RSigELU Single (P+RSigELUS) and P+RSigELU Double (P+RSigELUD), to improve the performance of the fixed-parameter RSigELU activation function. The performance of the proposed trainable activation functions was evaluated on the MNIST, CIFAR-10, and CIFAR-100 benchmark datasets. Results show that the proposed activation functions outperform the PReLU, PELU, ALISA, P+FELU, PSigmoid, and GELU activation functions found in the literature. The code for the activation functions is available at https://github.com/serhatklc/P-RsigELU-Activation-Function.
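The abstract's central point is the difference between a fixed, user-given activation parameter and one updated by backpropagation in every iteration. The sketch below is a minimal, hypothetical PyTorch rendering of a single-parameter P+RSigELU-style unit, not the authors' released implementation: the piecewise branch definitions (a sigmoid-weighted, alpha-scaled branch for x > 1, identity on [0, 1], and an ELU-like alpha*(e^x - 1) branch for x < 0) are assumed from the earlier RSigELU work, and the class name PRSigELUS is illustrative. The essential idea it demonstrates is that alpha is an nn.Parameter, so it is learned jointly with the network weights rather than specified before training.

```python
import torch
import torch.nn as nn


class PRSigELUS(nn.Module):
    """Minimal sketch of a single-parameter trainable (parametric) RSigELU-style unit.

    The slope alpha is an nn.Parameter, so it receives gradients and is updated
    in every training iteration together with the layer weights, instead of
    being fixed by the user before training.
    """

    def __init__(self, alpha_init: float = 1.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed piecewise form (taken from the earlier RSigELU paper, not from this article):
        pos = x * torch.sigmoid(x) * self.alpha + x                    # x > 1: sigmoid-weighted, alpha-scaled
        lin = x                                                        # 0 <= x <= 1: identity
        neg = self.alpha * (torch.exp(torch.clamp(x, max=0.0)) - 1.0)  # x < 0: ELU-like (clamp avoids exp overflow)
        return torch.where(x > 1.0, pos, torch.where(x >= 0.0, lin, neg))


# Usage: alpha is optimized jointly with the weights of the surrounding layers.
model = nn.Sequential(nn.Linear(4, 8), PRSigELUS(), nn.Linear(8, 1))
loss = model(torch.randn(2, 4)).sum()
loss.backward()
print(model[1].alpha.grad)  # non-None gradient: the activation parameter is itself trainable
```

A double-parameter variant (in the spirit of P+RSigELUD) would simply register a second learnable parameter for the negative branch so the positive and negative regions can adapt independently.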
Pages: 7595-7607
Number of pages: 13