RECURRENT NEURAL NETWORKS WITH FLEXIBLE GATES USING KERNEL ACTIVATION FUNCTIONS

Cited: 0
Authors
Scardapane, Simone [1 ]
Van Vaerenbergh, Steven [2 ]
Comminiello, Danilo [1 ]
Totaro, Simone [1 ]
Uncini, Aurelio [1 ]
Affiliations
[1] Sapienza Univ Rome, Rome, Italy
[2] Univ Cantabria, Santander, Spain
Keywords
Recurrent network; LSTM; GRU; Gate; Kernel activation function;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gated recurrent neural networks have achieved remarkable results in the analysis of sequential data. Inside these networks, gates are used to control the flow of information, making it possible to model even very long-term dependencies in the data. In this paper, we investigate whether the original gate equation (a linear projection followed by an element-wise sigmoid) can be improved. In particular, we design a more flexible architecture, with a small number of adaptable parameters, that is able to model a wider range of gating functions than the classical one. To this end, we replace the sigmoid function in the standard gate with a non-parametric formulation extending the recently proposed kernel activation function (KAF), with the addition of a residual skip-connection. A set of experiments on sequential variants of the MNIST dataset shows that adopting this novel gate improves accuracy at a negligible cost in computational power, while requiring far fewer training iterations to converge.
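For illustration, below is a minimal PyTorch sketch of a flexible gate in the spirit of the abstract: a kernel activation function (a learnable mixture of Gaussian kernels over a fixed dictionary) combined with a residual sigmoid skip branch. The Gaussian kernel, the uniform dictionary, the bandwidth heuristic, the additive combination with the sigmoid, and the clamping to [0, 1] are all assumptions for the sketch; the paper's exact formulation may differ.

import torch
import torch.nn as nn

class KAFGate(nn.Module):
    """Sketch of a flexible gate: a kernel activation function (KAF)
    plus a residual sigmoid skip-connection. Dictionary layout,
    kernel choice, and how the branches combine are assumptions."""

    def __init__(self, num_features, dict_size=20, span=3.0):
        super().__init__()
        # Fixed dictionary of kernel centres shared across features
        # (assumption: uniform grid on [-span, span]).
        centers = torch.linspace(-span, span, dict_size)
        self.register_buffer("centers", centers)
        # Learnable mixing coefficients, one set per feature.
        # Zero init makes the gate coincide with a plain sigmoid at start.
        self.alpha = nn.Parameter(torch.zeros(num_features, dict_size))
        # Bandwidth from the dictionary step (a common KAF heuristic).
        step = (centers[1] - centers[0]).item()
        self.gamma = 1.0 / (2.0 * step ** 2)

    def forward(self, s):
        # s: (batch, num_features) pre-activation of the gate.
        # Gaussian kernel between each activation and each centre:
        # shape (batch, num_features, dict_size) via broadcasting.
        k = torch.exp(-self.gamma * (s.unsqueeze(-1) - self.centers) ** 2)
        kaf = (k * self.alpha).sum(dim=-1)
        # Residual skip-connection: start from the classical sigmoid
        # gate and let the KAF learn a correction (assumption: additive
        # combination, clamped back to a valid gating range).
        return torch.clamp(torch.sigmoid(s) + kaf, 0.0, 1.0)

Inside an LSTM or GRU cell, this module would replace the element-wise sigmoid applied to a gate's pre-activation (e.g., gate(W x + U h + b) instead of sigmoid(W x + U h + b)); with alpha initialized to zero, training starts from the classical gate and only departs from it where the data warrants.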
Pages: 6
Related Papers
50 records in total
  • [31] Placing spline knots in neural networks using splines as activation functions
    Hlavackova, K
    Verleysen, M
    NEUROCOMPUTING, 1997, 17 (3-4) : 159 - 166
  • [32] Statistical approximation learning of discontinuous functions using simultaneous recurrent neural networks
    Sakai, M
    Homma, N
Gupta, MM
    Abe, K
    PROCEEDINGS OF THE 2002 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL, 2002, : 434 - 439
  • [33] Global exponential stability for recurrent neural networks with a general class of activation functions and variable delays
    Zhou, DM
    Zhang, LM
    Zhao, DF
PROCEEDINGS OF THE 2003 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS & SIGNAL PROCESSING, VOLS 1 AND 2, 2003 : 108 - 111
  • [34] Recurrent Kernel Networks
    Chen, Dexiong
    Jacob, Laurent
    Mairal, Julien
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [35] Multistability analysis of complex-valued recurrent neural networks with sine and cosine activation functions
    Yang, Liu
    Gong, Weiqiang
    Li, Qiang
    Sun, Fanrong
    Xing, Mali
    NEUROCOMPUTING, 2024, 577
  • [36] A Lyapunov-Based Method of Reducing Activation Functions of Recurrent Neural Networks for Stability Analysis
    Yuno, Tsuyoshi
    Fukuchi, Kazuma
    Ebihara, Yoshio
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 1102 - 1107
  • [37] Complete Stability of Delayed Recurrent Neural Networks With New Wave-Type Activation Functions
    Yan, Zepeng
    Sun, Wen
    Guo, Wanli
    Li, Biwen
    Wen, Shiping
    Cao, Jinde
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024 : 1 - 13
  • [38] Exponential convergence for high-order recurrent neural networks with a class of general activation functions
    Zhang, Hong
    Wang, Wentao
    Xiao, Bing
    APPLIED MATHEMATICAL MODELLING, 2011, 35 (01) : 123 - 129
  • [39] Multistability of Recurrent Neural Networks With Nonmonotonic Activation Functions and Unbounded Time-Varying Delays
    Liu, Peng
    Zeng, Zhigang
    Wang, Jun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 3000 - 3010
  • [40] Multistability of discrete-time recurrent neural networks with unsaturating piecewise linear activation functions
    Yi, Z
    Tan, KK
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2004, 15 (02) : 329 - 336