Learning a Single Neuron for Non-monotonic Activation Functions

Cited by: 0
Authors
Wu, Lei [1 ]
Institution
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
Keywords
EMPIRICAL RISK; LANDSCAPE;
DOI
None
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of learning a single neuron x -> sigma(w^T x) with gradient descent (GD). All the existing positive results are limited to the case where sigma is monotonic. However, it has recently been observed that non-monotonic activation functions outperform the traditional monotonic ones in many applications. To fill this gap, we establish learnability without assuming monotonicity. Specifically, when the input distribution is the standard Gaussian, we show that mild conditions on sigma (e.g., sigma has a dominating linear part) are sufficient to guarantee learnability in polynomial time with polynomially many samples. Moreover, under a stronger assumption on the activation function, the condition on the input distribution can be relaxed to a non-degeneracy condition on the marginal distribution. We remark that our conditions on sigma are satisfied by practical non-monotonic activation functions, such as SiLU/Swish and GELU. We also discuss how our positive results relate to existing negative results on training two-layer neural networks.
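The setting the abstract describes can be illustrated with a minimal numerical sketch: gradient descent on the squared loss for a single neuron with the non-monotonic SiLU/Swish activation, with inputs drawn from the standard Gaussian and labels generated by a planted teacher vector. This is an illustrative experiment only, not the paper's algorithm or proof; all names (`w_star`, learning rate, step count) are assumptions chosen for the demo.

```python
import numpy as np

def silu(z):
    # SiLU/Swish: z * sigmoid(z), a non-monotonic activation
    return z / (1.0 + np.exp(-z))

def silu_grad(z):
    # d/dz [z * sigmoid(z)] = sigmoid(z) * (1 + z * (1 - sigmoid(z)))
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 + z * (1.0 - s))

rng = np.random.default_rng(0)
d, n = 10, 5000

# Planted teacher neuron (unit norm) and standard Gaussian inputs
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
X = rng.standard_normal((n, d))
y = silu(X @ w_star)                    # noiseless labels

# Plain gradient descent on the empirical squared loss
w = 0.1 * rng.standard_normal(d)        # small random init
lr = 0.5
for _ in range(2000):
    z = X @ w
    residual = silu(z) - y
    grad = (X * (residual * silu_grad(z))[:, None]).mean(axis=0)
    w -= lr * grad

err = np.linalg.norm(w - w_star)
print(f"||w - w_star|| = {err:.2e}")
```

In this benign regime (Gaussian input, noiseless teacher, SiLU's dominating linear part) plain GD recovers the teacher direction; the paper's contribution is proving that such convergence holds in polynomial time without assuming sigma is monotonic.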
Pages: 20
Related Papers
50 results
  • [1] Supervised learning for multilayered neural network with non-monotonic activation functions
    Kotani, M
    Akazawa, K
    [J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VI, 1996, : 13 - 22
  • [2] Hopf bifurcation and chaos in a single delayed neuron equation with non-monotonic activation function
    Liao, XF
    Wong, KW
    Leung, CS
    Wu, ZF
    [J]. CHAOS SOLITONS & FRACTALS, 2001, 12 (08) : 1535 - 1547
  • [3] Non-monotonic Explanation Functions
    Amgoud, Leila
    [J]. SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, ECSQARU 2021, 2021, 12897 : 19 - 31
  • [4] αSechSig and αTanhSig: two novel non-monotonic activation functions
    Kozkurt, Cemil
    Kilicarslan, Serhat
    Bas, Selcuk
    Elen, Abdullah
    [J]. SOFT COMPUTING, 2023, 27 (24) : 18451 - 18467
  • [5] ErfAct and Pserf: Non-monotonic Smooth Trainable Activation Functions
    Biswas, Koushik
    Kumar, Sandeep
    Banerjee, Shilpak
    Pandey, Ashish Kumar
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6097 - 6105
  • [6] Circuit implementation for non-monotonic chaotic neuron
    [J]. Bandaoti Guangdian/Semiconductor Optoelectronics, 1997, 18 (05) : 302 - 306
  • [7] Learning non-monotonic additive value functions for multicriteria decision making
    Doumpos, Michael
    [J]. OR SPECTRUM, 2012, 34 (01) : 89 - 106
  • [8] Multistability of Memristive Neural Networks with Non-monotonic Piecewise Linear Activation Functions
    Nie, Xiaobing
    Cao, Jinde
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2015, 2015, 9377 : 182 - 191