Neural Network Training With Levenberg-Marquardt and Adaptable Weight Compression

Cited by: 68
Authors
Smith, James S. [1]
Wu, Bo [1, 2]
Wilamowski, Bogdan M. [1, 3]
Affiliations
[1] Auburn Univ, Dept Elect & Comp Engn, Auburn, AL 36849 USA
[2] Jinan Univ, Big Data Decis Inst, Guangzhou 510632, Guangdong, Peoples R China
[3] Univ IT & Management, PL-35225 Rzeszow, Poland
Keywords
Diminishing gradient; Levenberg-Marquardt (LM) algorithm; neural network training; weight compression
DOI
10.1109/TNNLS.2018.2846775
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Difficult experiments in training neural networks often fail to converge due to what is known as the flat-spot problem, where the gradient of hidden neurons in the network diminishes in value, rendering the weight update process ineffective. Whereas a first-order algorithm can address this issue by learning parameters to normalize neuron activations, second-order algorithms cannot afford additional parameters, given that they include a large Jacobian matrix calculation. This paper proposes Levenberg-Marquardt with weight compression (LM-WC), which combats the flat-spot problem by compressing neuron weights to push neuron activation out of the saturated region and close to the linear region. The presented algorithm requires no additional learned parameters and contains an adaptable compression parameter, which is adjusted to avoid training failure and increase the probability of neural network convergence. Several experiments are presented and discussed to demonstrate the success of LM-WC against standard LM and LM with random restarts on benchmark data sets for varying network architectures. Our results suggest that the LM-WC algorithm can improve training success by 10 times or more compared with other methods.
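The flat-spot effect described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's compression rule or how the compression parameter is adapted, so everything here is an assumption for illustration only: a single tanh neuron, a hypothetical `compress` helper that scales the incoming weights and bias uniformly, and a fixed factor of 0.1. It shows only that shrinking the weights moves the net input out of the saturated region, where the activation slope is nearly zero, back toward the linear region.

```python
import math


def tanh_slope(net):
    """Derivative of tanh at the given net input: 1 - tanh(net)^2."""
    return 1.0 - math.tanh(net) ** 2


def neuron_net(weights, bias, inputs):
    """Weighted sum of inputs plus bias for a single neuron."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def compress(weights, bias, factor):
    """Hypothetical compression (an assumption, not the paper's rule):
    scale all incoming weights and the bias by a factor in (0, 1]."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    return [w * factor for w in weights], bias * factor


# Illustrative values chosen so the neuron starts deep in saturation.
weights = [4.0, -3.0, 5.0]
bias = 2.0
inputs = [1.0, -1.0, 1.0]

net_before = neuron_net(weights, bias, inputs)
slope_before = tanh_slope(net_before)

# Fixed factor for the sketch; LM-WC adapts its compression parameter.
small_weights, small_bias = compress(weights, bias, factor=0.1)
net_after = neuron_net(small_weights, small_bias, inputs)
slope_after = tanh_slope(net_after)

print(f"before: net={net_before:.2f}, slope={slope_before:.3e}")
print(f"after:  net={net_after:.2f}, slope={slope_after:.3e}")
```

Because the Jacobian entries for a hidden neuron are proportional to its activation slope, a near-zero slope leaves the LM update with almost no information about that neuron's weights; restoring a usable slope is what lets training resume.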
Pages: 580-587
Page count: 8
Related papers
50 in total
  • [1] Neighborhood based Levenberg-Marquardt algorithm for neural network training
    Lera, G
    Pinzolas, M
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, 13 (05): : 1200 - 1203
  • [2] Backpropagation and Levenberg-Marquardt Algorithm for Training Finite Element Neural Network
    Reynaldi, Arnold
    Lukas, Samuel
    Margaretha, Helena
    [J]. 2012 SIXTH UKSIM/AMSS EUROPEAN SYMPOSIUM ON COMPUTER MODELLING AND SIMULATION (EMS), 2012, : 89 - 94
  • [3] Variable projection method and levenberg-marquardt algorithm for neural network training
    Kim, Cheol-Taek
    Lee, Ju-Jang
    Kim, Hyejin
    [J]. IECON 2006 - 32ND ANNUAL CONFERENCE ON IEEE INDUSTRIAL ELECTRONICS, VOLS 1-11, 2006, : 2084 - +
  • [4] A quasi-local Levenberg-Marquardt algorithm for neural network training
    Lera, G
    Pinzolas, M
    [J]. IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE, 1998, : 2242 - 2246
  • [5] Modified Levenberg-Marquardt Method for Neural Networks Training
    Suratgar, Amir Abolfazl
    Tavakoli, Mohammad Bagher
    Hoseinabadi, Abbas
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 6, 2005, : 46 - 48
  • [6] Levenberg-Marquardt Training Algorithms for Random Neural Networks
    Basterrech, Sebastian
    Mohammed, Samir
    Rubino, Gerardo
    Soliman, Mostafa
    [J]. COMPUTER JOURNAL, 2011, 54 (01): : 125 - 135
  • [7] A Novel Modification on the Levenberg-Marquardt Algorithm for Avoiding Overfitting in Neural Network Training
    Iplikci, Serdar
    Bilgi, Batuhan
    Menemen, Ali
    Bahtiyar, Bedri
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: DEEP LEARNING, PT II, 2019, 11728 : 201 - 207
  • [8] Stability Analysis of the Modified Levenberg-Marquardt Algorithm for the Artificial Neural Network Training
    Rubio, Jose de Jesus
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (08) : 3510 - 3524
  • [9] Applying Bayesian Regularization for Acceleration of Levenberg-Marquardt based Neural Network Training
    Suliman, Azizah
    Omarov, Batyrkhan S.
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2018, 5 (01): : 68 - 72
  • [10] Performance of the Levenberg-Marquardt neural network training method in electronic nose applications
    Kermani, BG
    Schiffman, SS
    Nagle, HT
    [J]. SENSORS AND ACTUATORS B-CHEMICAL, 2005, 110 (01) : 13 - 22