Natural conjugate gradient training of multilayer perceptrons

Cited by: 0
Authors
Gonzalez, Ana [1]
Dorronsoro, Jose R.
Affiliations
[1] Univ Autonoma Madrid, Dpto Ingn Informat, E-28049 Madrid, Spain
[2] Univ Autonoma Madrid, Inst Ingn Conocimiento, E-28049 Madrid, Spain
Keywords
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In maximum log-likelihood estimation, the Fisher matrix defines a Riemannian metric in weight space and, as shown by Amari and his coworkers, the resulting natural gradient greatly accelerates on-line multilayer perceptron (MLP) training. While its batch gradient descent counterpart also improves on standard gradient descent (it amounts to a Gauss-Newton approximation to mean square error minimization), it may no longer be competitive with more advanced gradient-based function minimization procedures. In this work we show how to introduce natural gradients in a conjugate gradient (CG) setting, and we demonstrate numerically that, when applied to batch MLP learning, they lead to faster convergence to better minima than standard Euclidean CG descent achieves. Since a drawback of the full natural gradient is its larger computational cost, we also consider some cost-simplifying variants and show that one of them, diagonal natural CG, also gives better minima than standard CG with comparable complexity.
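The diagonal natural CG idea described in the abstract can be pictured with a short sketch. The code below is an illustrative reconstruction, not the authors' published algorithm: it trains a one-hidden-layer MLP on synthetic 1-d regression data with Polak-Ribière conjugate gradient, preconditioning the gradient by a damped diagonal approximation of the Fisher/Gauss-Newton matrix, diag_F[i] ≈ (1/N) Σ_n (∂f(x_n; w)/∂w_i)². The data, network size, damping constant and helper names (forward, loss_and_grads, line_search) are all assumptions made for the example.

```python
# Illustrative sketch only: diagonally preconditioned ("natural") conjugate
# gradient for a one-hidden-layer MLP under squared error.  All names,
# sizes and constants below are assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-d regression data (made up for the example).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=200)

H = 10                                    # hidden units (assumption)
P = 3 * H + 1                             # total number of parameters

def unpack(w):
    """Split the flat parameter vector into W1, b1, W2, b2."""
    return w[:H].reshape(H, 1), w[H:2*H], w[2*H:3*H], w[3*H]

def forward(w, X):
    W1, b1, W2, b2 = unpack(w)
    a = np.tanh(X @ W1.T + b1)            # hidden activations, shape (N, H)
    return a @ W2 + b2, a

def loss_and_grads(w, X, y):
    """Batch half-MSE, its gradient, and a damped diagonal Fisher estimate."""
    _, _, W2, _ = unpack(w)
    f, a = forward(w, X)
    r = f - y                             # residuals
    # Per-sample gradients of the network output w.r.t. every parameter,
    # ordered to match unpack(): W1, b1, W2, b2.
    dhid = (1.0 - a**2) * W2              # backprop through tanh, (N, H)
    J = np.hstack([dhid * X, dhid, a, np.ones((len(X), 1))])   # (N, P)
    grad = (r @ J) / len(X)               # gradient of 0.5 * mean(r**2)
    diag_F = np.mean(J**2, axis=0) + 1e-4 # diagonal Gauss-Newton/Fisher + damping
    return 0.5 * np.mean(r**2), grad, diag_F

def line_search(w, d, X, y, f0, g0):
    """Backtracking line search with a simple sufficient-decrease test."""
    eta = 1.0
    while eta > 1e-8:
        f1, _, _ = loss_and_grads(w + eta * d, X, y)
        if f1 < f0 + 1e-4 * eta * (g0 @ d):
            break
        eta *= 0.5
    return eta

w = 0.1 * rng.normal(size=P)
loss, g, diag_F = loss_and_grads(w, X, y)
s = g / diag_F                            # preconditioned ("natural") gradient
d = -s                                    # first descent direction

for it in range(200):
    eta = line_search(w, d, X, y, loss, g)
    w = w + eta * d
    loss, g_new, diag_F = loss_and_grads(w, X, y)
    s_new = g_new / diag_F
    # Polak-Ribiere coefficient computed with the preconditioned gradients.
    beta = max(0.0, ((g_new - g) @ s_new) / (g @ s))
    d = -s_new + beta * d
    g, s = g_new, s_new
    if it % 50 == 0:
        print(f"iter {it:3d}  mse {2 * loss:.5f}")
```

Replacing diag_F with the full Fisher matrix in the same loop would correspond to the full natural CG discussed in the abstract, at the extra cost of forming and solving with a P x P matrix at every iteration, which is precisely the overhead the diagonal variant is meant to avoid.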
Pages: 169-177
Number of pages: 9
Related papers
50 records in total
  • [1] Natural conjugate gradient training of multilayer perceptrons
    Gonzalez, Ana
    Dorronsoro, Jose R.
    [J]. NEUROCOMPUTING, 2008, 71 (13-15) : 2499 - 2506
  • [2] A note on conjugate natural gradient training of multilayer perceptrons
    Gonzalez, Ana
    Dorronsoro, Jose R.
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 887 - +
  • [3] Complexity issues in natural gradient descent method for training multilayer perceptrons
    Yang, HH
    Amari, S
    [J]. NEURAL COMPUTATION, 1998, 10 (08) : 2137 - 2157
  • [4] An improvement to the natural gradient learning algorithm for multilayer perceptrons
    Bastian, MR
    Gunther, JH
    Moon, TK
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 313 - 316
  • [5] Adaptive method of realizing natural gradient learning for multilayer perceptrons
    Amari, S
    Park, H
    Fukumizu, K
    [J]. NEURAL COMPUTATION, 2000, 12 (06) : 1399 - 1409
  • [6] An Adaptive Natural Gradient Method with Adaptive Step Size in Multilayer Perceptrons
    Guo, Weili
    Wei, Haikun
    Liu, Tianhong
    Song, Aiguo
    Zhang, Kanjian
    [J]. 2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 1593 - 1597
  • [7] Fast training of multilayer perceptrons
    Verma, B
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1997, 8 (06) : 1314 - 1320
  • [8] Training multi-layer perceptrons by natural gradient descent
    Yang, HH
    Amari, S
    [J]. PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 211 - 214
  • [9] An adaptive method of training multilayer perceptrons
    Lo, JT
    Bassu, D
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2013 - 2018
  • [10] Robust formulations for training multilayer perceptrons
    Kärkkäinen, T
    Heikkola, E
    [J]. NEURAL COMPUTATION, 2004, 16 (04) : 837 - 862