Neural Network Training With Levenberg-Marquardt and Adaptable Weight Compression

Cited by: 69
Authors
Smith, James S. [1]
Wu, Bo [1,2]
Wilamowski, Bogdan M. [1,3]
Affiliations
[1] Auburn Univ, Dept Elect & Comp Engn, Auburn, AL 36849 USA
[2] Jinan Univ, Big Data Decis Inst, Guangzhou 510632, Guangdong, Peoples R China
[3] Univ IT & Management, PL-35225 Rzeszow, Poland
Keywords
Diminishing gradient; Levenberg-Marquardt (LM) algorithm; neural network training; weight compression;
DOI
10.1109/TNNLS.2018.2846775
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Difficult experiments in training neural networks often fail to converge due to what is known as the flat-spot problem, where the gradient of hidden neurons in the network diminishes in value, rendering the weight update process ineffective. Whereas a first-order algorithm can address this issue by learning parameters to normalize neuron activations, second-order algorithms cannot afford additional parameters given that they include a large Jacobian matrix calculation. This paper proposes Levenberg-Marquardt with weight compression (LM-WC), which combats the flat-spot problem by compressing neuron weights to push neuron activation out of the saturated region and close to the linear region. The presented algorithm requires no additional learned parameters and contains an adaptable compression parameter, which is adjusted to avoid training failure and increase the probability of neural network convergence. Several experiments are presented and discussed to demonstrate the success of LM-WC against standard LM and LM with random restarts on benchmark data sets for varying network architectures. Our results suggest that the LM-WC algorithm can improve training success by 10 times or more compared with other methods.
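The flat-spot effect and the idea of compressing weights toward the linear region can be illustrated with a toy sketch. This is not the paper's LM-WC update: the abstract does not give the compression formula, so the `compress_weights` function, its `target_net` parameter, and the rescaling rule below are all hypothetical, chosen only to show why shrinking weights restores a usable gradient for a saturated tanh neuron.

```python
import math

def tanh_derivative(net):
    """Slope of tanh at a given net input; close to 0 when the neuron saturates."""
    return 1.0 - math.tanh(net) ** 2

def compress_weights(weights, inputs, target_net=1.0):
    """Hypothetical compression rule (an assumption, not the paper's formula):
    rescale one neuron's weights so that its largest |net| over the given
    input patterns is at most target_net."""
    nets = [sum(w * x for w, x in zip(weights, pattern)) for pattern in inputs]
    max_abs_net = max(abs(n) for n in nets)
    if max_abs_net <= target_net:
        return list(weights)  # already outside the saturated region
    scale = target_net / max_abs_net
    return [w * scale for w in weights]

# Toy neuron with large weights, so tanh saturates on both input patterns.
weights = [8.0, -6.0]
inputs = [[1.0, 0.0], [0.0, 1.0]]

compressed = compress_weights(weights, inputs)

# Gradient factor for the first pattern, before and after compression.
slope_before = tanh_derivative(sum(w * x for w, x in zip(weights, inputs[0])))
slope_after = tanh_derivative(sum(w * x for w, x in zip(compressed, inputs[0])))

print(compressed, slope_before, slope_after)
```

In this toy case the slope of the activation rises from a value near zero to a value of order one after rescaling, which is the qualitative behavior the abstract attributes to weight compression. How LM-WC chooses and adapts its compression parameter is described in the paper itself, not here.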
Pages: 580-587
Number of pages: 8
Related papers
50 in total
  • [31] Lateral control of autonomous vehicle using levenberg-marquardt neural network algorithm
    Lee, K.B.
    Kim, Y.J.
    Ahn, O.S.
    Kim, Y.B.
    International Journal of Automotive Technology, 2002, 3 (02) : 79 - 88
  • [32] An Improved Levenberg-Marquardt Algorithm with Adaptive Learning Rate for RBF Neural Network
    An Ru
    Li Wen Jing
    Han Hong Gui
    Qiao Jun Fei
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 3630 - 3635
  • [33] Artificial Neural Network Channel Estimation Based on Levenberg-Marquardt for OFDM Systems
    Cebrail Çiflikli
    A. Tuncay Özşahin
    A. Çağrı Yapici
    Wireless Personal Communications, 2009, 51 : 221 - 229
  • [34] Parallel and separable recursive Levenberg-Marquardt training algorithm
    Asirvadam, VS
    McLoone, SF
    Irwin, GW
    NEURAL NETWORKS FOR SIGNAL PROCESSING XII, PROCEEDINGS, 2002, : 129 - 138
  • [35] Application of the Neural Network in Diagnosis of Breast Cancer Based on Levenberg-Marquardt Algorithm
    Min, Zeng
    Xiao, Liang
    Cao, Lin
    Yan, Hangcheng
    2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 268 - 272
  • [36] A Parallel Levenberg-Marquardt Algorithm for Recursive Neural Network in a Robot Control System
    Wang, Wei
    Pu, Yunming
    Li, Wang
    INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2018, 12 (02) : 32 - 47
  • [37] A Graph Neural Network Approach with Improved Levenberg-Marquardt for Electrical Impedance Tomography
    Zhao, Ruwen
    Xu, Chuanpei
    Zhu, Zhibin
    Mo, Wei
    APPLIED SCIENCES-BASEL, 2024, 14 (02):
  • [38] Artificial Neural Network Channel Estimation Based on Levenberg-Marquardt for OFDM Systems
    Ciflikli, Cebrail
    Ozsahin, A. Tuncay
    Yapici, A. Cagri
    WIRELESS PERSONAL COMMUNICATIONS, 2009, 51 (02) : 221 - 229
  • [39] Neural network predictive control for dissolved oxygen based on levenberg-marquardt algorithm
    Li M.
    Zhou L.
    Wang J.
    Nongye Jixie Xuebao, 6 (297-302): : 297 - 302
  • [40] Adding Nonlinear System Dynamics to Levenberg-Marquardt Algorithm for Neural Network Control
    Larrea, Mikel
    Irigoyen, Eloy
    Gomez, Vicente
    ARTIFICIAL NEURAL NETWORKS (ICANN 2010), PT III, 2010, 6354 : 352 - 357