Parametric RSigELU: a new trainable activation function for deep learning

Cited: 0
Authors
Kilicarslan, Serhat [1 ]
Celik, Mete [2 ]
Affiliations
[1] Bandirma Onyedi Eylul Univ, Dept Software Engn, TR-10200 Balikesir, Turkiye
[2] Erciyes Univ, Fac Engn, Dept Comp Engn, TR-38039 Kayseri, Turkiye
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Iss. 13
Keywords
Deep learning; Parametric activation function (P plus RSigELU); MNIST; CIFAR-10; CIFAR-100; Trainable activation function;
DOI
10.1007/s00521-024-09538-9
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Activation functions help deep learning models extract meaningful relationships from real-world problems. Their development is therefore of great interest to researchers, since the choice of activation function directly affects model performance. In the literature, nonlinear activation functions are generally preferred because linear activation functions limit the learning capacity of deep learning models. Nonlinear activation functions can be classified as fixed-parameter or trainable, depending on whether the function's parameters are fixed (i.e., user-given) or updated during training. The parameters of fixed-parameter activation functions must be specified before training; determining appropriate values is time-consuming and can slow the model's convergence. In contrast, trainable activation functions, whose parameters are updated at each iteration of training, converge faster and better by finding the parameter values best suited to the dataset and architecture. This study proposes the parametric RSigELU (P+RSigELU) trainable activation functions, namely P+RSigELU Single (P+RSigELUS) and P+RSigELU Double (P+RSigELUD), to improve on the fixed-parameter RSigELU activation function. The proposed functions were evaluated on the benchmark MNIST, CIFAR-10, and CIFAR-100 datasets. Results show that they outperform the PReLU, PELU, ALISA, P+FELU, PSigmoid, and GELU activation functions from the literature. The code is available at https://github.com/serhatklc/P-RsigELU-Activation-Function.
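The abstract does not give the closed form of the proposed functions. As a minimal illustrative sketch (not the paper's implementation), the snippet below assumes the piecewise single-parameter RSigELU definition from the authors' earlier fixed-parameter work, and shows the forward pass together with the partial derivative with respect to the parameter alpha, which is the quantity a trainable (P+RSigELUS) variant would update by backpropagation. The function names and the alpha = 0.5 default are illustrative assumptions.

```python
import numpy as np

def rsigelu_s(x, alpha=0.5):
    """Assumed single-parameter RSigELU forward pass (piecewise):
      x > 1      : x * sigmoid(x) * alpha + x   (sigmoid-weighted positive region)
      0 <= x <= 1: x                            (identity, ReLU-like region)
      x < 0      : alpha * (exp(x) - 1)         (ELU-like negative region)
    In the parametric variant, alpha is a trainable parameter rather
    than a user-given constant.
    """
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x > 1, x * sig * alpha + x,
                    np.where(x >= 0, x, alpha * np.expm1(x)))

def d_rsigelu_s_dalpha(x):
    """Partial derivative of the assumed RSigELUS w.r.t. alpha.
    A trainable variant would use this in the chain rule to update
    alpha at each training iteration."""
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x > 1, x * sig, np.where(x >= 0, 0.0, np.expm1(x)))
```

In a framework such as PyTorch, the same idea would be realized by registering alpha as a learnable parameter so the optimizer updates it alongside the network weights, which is what distinguishes P+RSigELU from its fixed-parameter predecessor.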
Pages: 7595-7607
Page count: 13