Design of activation function in CNN for image classification

Cited by: 6
Authors
Wang H.-X. [1 ]
Zhou J.-Q. [1 ]
Gu C.-H. [1 ]
Lin H. [1 ]
Affiliations
[1] School of Computer Science and Technology, Wuhan University of Technology, Wuhan
Keywords
Activation function; Combinatorial activation function; Convolutional neural network; Image classification; Neuron necrosis; ReLU
DOI
10.3785/j.issn.1008-973X.2019.07.016
Abstract
A new combinatorial activation function, relu-softsign, was proposed to improve image classification, addressing two problems: the derivative of ReLU, the activation function commonly used in convolutional neural networks, is identically zero on the negative x semi-axis, which easily causes neuron necrosis during training; and the existing combinatorial activation function relu-softplus converges only with a small learning rate, which makes convergence slow. The role of the activation function during training was analyzed, and the key points to consider when designing an activation function were given. Following these points, the ReLU and softsign functions were combined piecewise on the positive and negative semi-axes of x, so that the derivative on the negative semi-axis is no longer identically zero. The combined function was then compared with single activation functions and with the relu-softplus combinatorial activation function on the MNIST, PI100, CIFAR-100 and Caltech256 datasets. The experimental results show that relu-softsign improves model classification accuracy, simply and effectively mitigates the irreversible "necrosis" of neurons, and accelerates model convergence, especially on complex datasets. © 2019, Zhejiang University Press. All rights reserved.
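
For concreteness, the sketch below illustrates the piecewise combination the abstract describes, under an assumed form that is not taken from the paper: identity (ReLU) on the positive semi-axis and softsign, x/(1+|x|), on the negative semi-axis. It demonstrates the property the abstract claims: the derivative on the negative semi-axis is strictly positive, so neurons cannot permanently die as they can under plain ReLU.

    import numpy as np

    def relu_softsign(x):
        # Assumed relu-softsign combination (illustrative only, not verified
        # against the paper): identity (ReLU) for x >= 0, softsign for x < 0.
        return np.where(x >= 0, x, x / (1.0 + np.abs(x)))

    def relu_softsign_grad(x):
        # Derivative of the assumed form: 1 for x >= 0 and
        # 1 / (1 + |x|)^2 for x < 0 -- strictly positive everywhere,
        # so gradients on the negative semi-axis never vanish.
        return np.where(x >= 0, 1.0, 1.0 / (1.0 + np.abs(x)) ** 2)

    x = np.array([-3.0, -1.0, 0.0, 2.0])
    print(relu_softsign(x))       # -> [-0.75, -0.5, 0.0, 2.0]
    print(relu_softsign_grad(x))  # -> [0.0625, 0.25, 1.0, 1.0]

Note that the derivative at x = -3 is small (0.0625) but nonzero, which is the mechanism by which the "necrosis" phenomenon is mitigated in this assumed form.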
Pages: 1363-1373
Number of pages: 10