Self-scaled conjugate gradient training algorithms

Cited by: 19
Authors
Kostopoulos, A. E. [1 ]
Grapsa, T. N. [1 ]
Affiliations
[1] Univ Patras, Dept Math, GR-26504 Patras, Greece
Keywords
Neural network; Training; Self-scaled conjugate gradient; Perry's method; Line search; Learning algorithms; Restart procedures; Convergence
DOI
10.1016/j.neucom.2009.04.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article presents several efficient training algorithms based on conjugate gradient optimization methods. In addition to the existing conjugate gradient training algorithms, we introduce Perry's conjugate gradient method as a training algorithm [A. Perry, A modified conjugate gradient algorithm, Operations Research 26 (1978) 26-43]. Perry's method has proven to be very efficient in the context of unconstrained optimization, but it has never been used in MLP training. Furthermore, a new class of conjugate gradient (CG) methods is proposed, called self-scaled CG methods, which are derived from the principles of the Hestenes-Stiefel, Fletcher-Reeves, Polak-Ribiere and Perry methods. This class is based on the spectral scaling parameter introduced in [J. Barzilai, J.M. Borwein, Two point step size gradient methods, IMA Journal of Numerical Analysis 8 (1988) 141-148]. The spectral scaling parameter contains second-order information without estimating the Hessian matrix. Furthermore, we incorporate into the CG training algorithms an efficient line search technique based on the Wolfe conditions and on safeguarded cubic interpolation [D.F. Shanno, K.H. Phua, Minimization of unconstrained multivariate functions, ACM Transactions on Mathematical Software 2 (1976) 87-94]. In addition, the initial learning rate parameter, fed to the line search technique, was automatically adapted at each iteration by a closed formula proposed in [D.F. Shanno, K.H. Phua, Minimization of unconstrained multivariate functions, ACM Transactions on Mathematical Software 2 (1976) 87-94; D.G. Sotiropoulos, A.E. Kostopoulos, T.N. Grapsa, A spectral version of Perry's conjugate gradient method for neural network training, in: D.T. Tsahalis (Ed.), Fourth GRACM Congress on Computational Mechanics, vol. 1, 2002, pp. 172-179]. Finally, an efficient restarting procedure was employed in order to further improve the effectiveness of the CG training algorithms.
Experimental results show that, in general, the new class of methods can perform better, with a much lower computational cost and a higher success rate. © 2009 Elsevier B.V. All rights reserved.
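For orientation, the quantities named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using the textbook forms of the Hestenes-Stiefel, Fletcher-Reeves, Polak-Ribiere and Perry update parameters, one of the two Barzilai-Borwein step sizes, and the weak Wolfe conditions. The function names (`cg_betas`, `bb_spectral`, `satisfies_wolfe`) and the constants `c1` and `c2` are assumptions made for this sketch. The article's own self-scaled formulas, line search, and restart procedure are not reproduced here.

```python
import numpy as np

def cg_betas(g_new, g_old, d_old, s):
    """Textbook CG update parameters (not the article's self-scaled variants).

    g_new, g_old: gradients at the new and previous weights
    d_old:        previous search direction
    s:            previous step, s = w_new - w_old
    """
    y = g_new - g_old                 # gradient difference
    dy = d_old @ y
    gg = g_old @ g_old
    return {
        "HS": (g_new @ y) / dy,            # Hestenes-Stiefel
        "FR": (g_new @ g_new) / gg,        # Fletcher-Reeves
        "PR": (g_new @ y) / gg,            # Polak-Ribiere
        "Perry": (g_new @ (y - s)) / dy,   # Perry
    }

def bb_spectral(s, y):
    """One Barzilai-Borwein scalar, s's / s'y.

    It approximates an inverse curvature along s without forming the Hessian.
    """
    return (s @ s) / (s @ y)

def satisfies_wolfe(f, grad, w, d, alpha, c1=1e-4, c2=0.1):
    """Check the weak Wolfe conditions for step length alpha along d.

    c1 and c2 are illustrative defaults, not values taken from the article.
    """
    slope0 = grad(w) @ d
    w_new = w + alpha * d
    sufficient_decrease = f(w_new) <= f(w) + c1 * alpha * slope0
    curvature = grad(w_new) @ d >= c2 * slope0
    return bool(sufficient_decrease and curvature)
```

On a convex quadratic with an exact line search, the four update parameters coincide, which is a convenient sanity check; on a nonquadratic training error with an inexact line search they generally differ, which is why the choice of formula matters in practice.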
Pages: 3000-3019
Page count: 20
Related papers
50 in total
  • [31] Scaled conjugate gradient method for radar pulse modulation estimation
    Lunden, Jarmo
    Koivunen, Visa
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 297 - +
  • [32] Symbolic rule extraction with a scaled conjugate gradient version of CLARION
    Falas, T
    Stafylopatis, A
    [J]. Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005, : 845 - 848
  • [33] A scaled BFGS preconditioned conjugate gradient algorithm for unconstrained optimization
    Andrei, Neculai
    [J]. APPLIED MATHEMATICS LETTERS, 2007, 20 (06) : 645 - 650
  • [34] A SCALED CONJUGATE-GRADIENT ALGORITHM FOR FAST SUPERVISED LEARNING
    MOLLER, MF
    [J]. NEURAL NETWORKS, 1993, 6 (04) : 525 - 533
  • [35] A New Scaled Secant-Type Conjugate Gradient Algorithm
    Moghrabi, Issam A. R.
    [J]. 2017 EUROPEAN CONFERENCE ON ELECTRICAL ENGINEERING AND COMPUTER SCIENCE (EECS), 2017, : 96 - 100
  • [36] Self-scaled bounds for atomic cone ranks: applications to nonnegative rank and cp-rank
    Hamza Fawzi
    Pablo A. Parrilo
    [J]. Mathematical Programming, 2016, 158 : 417 - 465
  • [37] Scaled Conjugate Gradient Artificial Neural Network-Based Ripple Current Correlation MPPT Algorithms for PV System
    Noman, Abdullah M.
    Khan, Hamed
    Sher, Hadeed Ahmed
    Almutairi, Sulaiman Z.
    Alqahtani, Mohammed H.
    Aljumah, Ali S.
    [J]. INTERNATIONAL JOURNAL OF PHOTOENERGY, 2023, 2023
  • [38] Forecasting the Indian Stock Market by Applying the Levenberg-Marquardt and Scaled Conjugate Training Algorithms in Neural Networks
    Alfonso Perez, Gerardo
    Ramirez, D. R.
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE), 2017, 190 : 681 - 686
  • [39] Conjugate gradient algorithms for minor subspace analysis
    Badeau, Roland
    David, Bertrand
    Richard, Gael
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL III, PTS 1-3, PROCEEDINGS, 2007, : 1013 - +
  • [40] Block conjugate gradient algorithms for adaptive filtering
    Lim, JS
    Un, CK
    [J]. SIGNAL PROCESSING, 1996, 55 (01) : 65 - 77