Layer-wise learning based stochastic gradient descent method for the optimization of deep convolutional neural network

Citations: 45
Authors
Zheng, Qinghe [1 ]
Tian, Xinyu [2 ]
Jiang, Nan [3 ]
Yang, Mingqiang [1 ]
Affiliations
[1] Shandong Univ, Sch Informat Sci & Engn, Qingdao 266237, Shandong, Peoples R China
[2] Shandong Management Univ, Coll Mech & Elect Engn, Jinan, Shandong, Peoples R China
[3] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Hubei, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Deep learning; deep CNNs; non-convex optimization; SGD; layer-wise learning; FUZZY; ALGORITHM
DOI
10.3233/JIFS-190861
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Despite the popularity of deep convolutional neural networks (CNNs), the efficient training of network models remains challenging due to several problems. In this paper, we present a layer-wise learning based stochastic gradient descent method (LLb-SGD) for gradient-based optimization of objective functions in deep learning, which is simple and computationally efficient. By simulating the cross-media propagation mechanism of light in the natural environment, we set an adaptive learning rate for each layer of the neural network. To find a proper local optimum quickly, the dynamic learning sequence spanning different layers adaptively adjusts the descent speed of the objective function in a multi-scale, multi-dimensional environment. To the best of our knowledge, this is the first attempt to introduce an adaptive layer-wise learning schedule with a certain degree of convergence guarantee. Owing to its generality and robustness, the method is insensitive to hyper-parameters and can therefore be applied to various network architectures and datasets. Finally, we show promising results compared with other optimization methods on two image classification benchmarks using five standard networks.
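The abstract does not spell out the light-propagation-inspired update rule, but the core idea, one adaptive learning rate per layer inside an otherwise standard SGD loop, can be sketched in PyTorch via per-layer parameter groups. This is a minimal illustrative sketch, not the authors' LLb-SGD: layerwise_lr is a hypothetical depth-dependent decay standing in for the paper's schedule, and the small CNN is an arbitrary example.

import torch
import torch.nn as nn

def layerwise_lr(base_lr, depth_idx, num_layers, decay=0.5):
    # Hypothetical stand-in for LLb-SGD's light-propagation schedule:
    # deeper layers simply receive geometrically smaller steps here.
    return base_lr * (decay ** (depth_idx / max(num_layers - 1, 1)))

# A small CNN purely for illustration.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

# One optimizer parameter group per parameterized layer,
# each carrying its own learning rate.
layers = [m for m in model if any(True for _ in m.parameters(recurse=False))]
groups = [
    {"params": list(layer.parameters()), "lr": layerwise_lr(0.1, i, len(layers))}
    for i, layer in enumerate(layers)
]
optimizer = torch.optim.SGD(groups, lr=0.1, momentum=0.9)

# A standard SGD step; only the per-group rates differ from vanilla SGD.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

In the paper's method the per-layer rates are additionally adjusted dynamically during training; the fixed decay above only shows where layer-wise rates enter the optimizer.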
Pages: 5641-5654
Page count: 14
Related Papers (50 in total)
  • [41] Accelerating deep neural network training with inconsistent stochastic gradient descent
    Wang, Linnan
    Yang, Yi
    Min, Renqiang
    Chakradhar, Srimat
    [J]. NEURAL NETWORKS, 2017, 93 : 219 - 229
  • [42] LSDDL: Layer-Wise Sparsification for Distributed Deep Learning
    Hong, Yuxi
    Han, Peng
    [J]. BIG DATA RESEARCH, 2021, 26
  • [43] Dynamic layer-wise sparsification for distributed deep learning
    Zhang, Hao
    Wu, Tingting
    Ma, Zhifeng
    Li, Feng
    Liu, Jie
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 147 : 1 - 15
  • [44] Calibrated Stochastic Gradient Descent for Convolutional Neural Networks
    Zhuo, Li'an
    Zhang, Baochang
    Chen, Chen
    Ye, Qixiang
    Liu, Jianzhuang
    Doermann, David
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9348 - 9355
  • [45] Interpretable Convolutional Neural Network Through Layer-wise Relevance Propagation for Machine Fault Diagnosis
    Grezmak, John
    Zhang, Jianjing
    Wang, Peng
    Loparo, Kenneth A.
    Gao, Robert X.
    [J]. IEEE SENSORS JOURNAL, 2020, 20 (06) : 3172 - 3181
  • [46] ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining
    Mrazek, Vojtech
    Vasicek, Zdenek
    Sekanina, Lukas
    Hanif, Muhammad Abdullah
    Shafique, Muhammad
[J]. 2019 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2019
  • [47] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [48] Explaining Deep Neural Network using Layer-wise Relevance Propagation and Integrated Gradients
    Cik, Ivan
    Rasamoelina, Andrindrasana David
    Mach, Marian
    Sincak, Peter
    [J]. 2021 IEEE 19TH WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS (SAMI 2021), 2021, : 381 - 386
  • [49] Enriching Variety of Layer-wise Learning Information by Gradient Combination
    Wang, Chien-Yao
    Liao, Hong-Yuan Mark
    Chen, Ping-Yang
    Hsieh, Jun-Wei
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2477 - 2484
  • [50] Interpreting Convolutional Neural Networks via Layer-Wise Relevance Propagation
    Jia, Wohuan
    Zhang, Shaoshuai
    Jiang, Yue
    Xu, Li
    [J]. ARTIFICIAL INTELLIGENCE AND SECURITY, ICAIS 2022, PT I, 2022, 13338 : 457 - 467