Layer-wise learning based stochastic gradient descent method for the optimization of deep convolutional neural network

Citations: 45
Authors
Zheng, Qinghe [1 ]
Tian, Xinyu [2 ]
Jiang, Nan [3 ]
Yang, Mingqiang [1 ]
Affiliations
[1] Shandong Univ, Sch Informat Sci & Engn, Qingdao 266237, Shandong, Peoples R China
[2] Shandong Management Univ, Coll Mech & Elect Engn, Jinan, Shandong, Peoples R China
[3] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Hubei, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Deep learning; deep CNNs; non-convex optimization; SGD; layer-wise learning; FUZZY; ALGORITHM
DOI
10.3233/JIFS-190861
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Despite the popularity of deep convolutional neural networks (CNNs), the efficient training of network models remains challenging due to several problems. In this paper, we present a layer-wise learning based stochastic gradient descent method (LLb-SGD) for gradient-based optimization of objective functions in deep learning, which is simple and computationally efficient. By simulating the cross-media propagation mechanism of light in the natural environment, we set an adaptive learning rate for each layer of the neural network. To find a proper local optimum quickly, the dynamic learning sequence spanning different layers adaptively adjusts the descent speed of the objective function in a multi-scale, multi-dimensional environment. To the best of our knowledge, this is the first attempt to introduce an adaptive layer-wise learning schedule with a certain degree of convergence guarantee. Owing to its generality and robustness, the method is insensitive to hyper-parameters and can therefore be applied to various network architectures and datasets. Finally, we show promising results compared with other optimization methods on two image classification benchmarks using five standard networks.
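The abstract does not spell out the light-propagation-inspired update rule, but the core idea, one adaptive learning rate per layer inside an otherwise standard SGD loop, can be sketched in PyTorch via per-layer parameter groups. This is a minimal illustrative sketch, not the authors' LLb-SGD: layerwise_lr is a hypothetical depth-dependent decay standing in for the paper's schedule, and the small CNN is an arbitrary example.

import torch
import torch.nn as nn

def layerwise_lr(base_lr, depth_idx, num_layers, decay=0.5):
    # Hypothetical stand-in for LLb-SGD's light-propagation schedule:
    # deeper layers simply receive geometrically smaller steps here.
    return base_lr * (decay ** (depth_idx / max(num_layers - 1, 1)))

# A small CNN purely for illustration.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

# One optimizer parameter group per parameterized layer,
# each carrying its own learning rate.
layers = [m for m in model if any(True for _ in m.parameters(recurse=False))]
groups = [
    {"params": list(layer.parameters()), "lr": layerwise_lr(0.1, i, len(layers))}
    for i, layer in enumerate(layers)
]
optimizer = torch.optim.SGD(groups, lr=0.1, momentum=0.9)

# A standard SGD step; only the per-group rates differ from vanilla SGD.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

In the paper's method the per-layer rates are additionally adjusted dynamically during training; the fixed decay above only shows where layer-wise rates enter the optimizer.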
Pages: 5641-5654
Page count: 14
Related Papers (50 in total)
  • [41] Accelerating deep neural network training with inconsistent stochastic gradient descent
    Wang, Linnan
    Yang, Yi
    Min, Renqiang
    Chakradhar, Srimat
    [J]. NEURAL NETWORKS, 2017, 93 : 219 - 229
  • [42] LSDDL: Layer-Wise Sparsification for Distributed Deep Learning
    Hong, Yuxi
    Han, Peng
    [J]. BIG DATA RESEARCH, 2021, 26
  • [43] Dynamic layer-wise sparsification for distributed deep learning
    Zhang, Hao
    Wu, Tingting
    Ma, Zhifeng
    Li, Feng
    Liu, Jie
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 147 : 1 - 15
  • [44] Calibrated Stochastic Gradient Descent for Convolutional Neural Networks
    Zhuo, Li'an
    Zhang, Baochang
    Chen, Chen
    Ye, Qixiang
    Liu, Jianzhuang
    Doermann, David
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9348 - 9355
  • [45] Interpretable Convolutional Neural Network Through Layer-wise Relevance Propagation for Machine Fault Diagnosis
    Grezmak, John
    Zhang, Jianjing
    Wang, Peng
    Loparo, Kenneth A.
    Gao, Robert X.
    [J]. IEEE SENSORS JOURNAL, 2020, 20 (06) : 3172 - 3181
  • [46] ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining
    Mrazek, Vojtech
    Vasicek, Zdenek
    Sekanina, Lukas
    Hanif, Muhammad Abdullah
    Shafique, Muhammad
[J]. 2019 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2019
  • [47] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [48] Explaining Deep Neural Network using Layer-wise Relevance Propagation and Integrated Gradients
    Cik, Ivan
    Rasamoelina, Andrindrasana David
    Mach, Marian
    Sincak, Peter
    [J]. 2021 IEEE 19TH WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS (SAMI 2021), 2021, : 381 - 386
  • [49] Enriching Variety of Layer-wise Learning Information by Gradient Combination
    Wang, Chien-Yao
    Liao, Hong-Yuan Mark
    Chen, Ping-Yang
    Hsieh, Jun-Wei
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2477 - 2484
  • [50] Interpreting Convolutional Neural Networks via Layer-Wise Relevance Propagation
    Jia, Wohuan
    Zhang, Shaoshuai
    Jiang, Yue
    Xu, Li
    [J]. ARTIFICIAL INTELLIGENCE AND SECURITY, ICAIS 2022, PT I, 2022, 13338 : 457 - 467