Natural conjugate gradient training of multilayer perceptrons

Cited by: 0
Authors
Gonzalez, Ana [1]
Dorronsoro, Jose R.
Affiliations
[1] Univ Autonoma Madrid, Dpto Ingn Informat, E-28049 Madrid, Spain
[2] Univ Autonoma Madrid, Inst Ingn Conocimiento, E-28049 Madrid, Spain
Keywords
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In maximum log-likelihood estimation, the Fisher matrix defines a Riemannian metric in weight space and, as shown by Amari and his coworkers, the resulting natural gradient greatly accelerates on-line multilayer perceptron (MLP) training. While its batch gradient descent counterpart also improves on standard gradient descent (it amounts to a Gauss-Newton approximation to mean square error minimization), it may no longer be competitive with more advanced gradient-based function minimization procedures. In this work we show how to introduce natural gradients in a conjugate gradient (CG) setting, and we demonstrate numerically that, when applied to batch MLP learning, they lead to faster convergence to better minima than standard Euclidean CG descent achieves. Since a drawback of the full natural gradient is its larger computational cost, we also consider some cost-simplifying variants and show that one of them, diagonal natural CG, also gives better minima than standard CG with comparable complexity.
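The diagonal natural CG idea described in the abstract can be pictured with a short sketch. The code below is an illustrative reconstruction, not the authors' published algorithm: it trains a one-hidden-layer MLP on synthetic 1-d regression data with Polak-Ribière conjugate gradient, preconditioning the gradient by a damped diagonal approximation of the Fisher/Gauss-Newton matrix, diag_F[i] ≈ (1/N) Σ_n (∂f(x_n; w)/∂w_i)². The data, network size, damping constant and helper names (forward, loss_and_grads, line_search) are all assumptions made for the example.

```python
# Illustrative sketch only: diagonally preconditioned ("natural") conjugate
# gradient for a one-hidden-layer MLP under squared error.  All names,
# sizes and constants below are assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-d regression data (made up for the example).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=200)

H = 10                                    # hidden units (assumption)
P = 3 * H + 1                             # total number of parameters

def unpack(w):
    """Split the flat parameter vector into W1, b1, W2, b2."""
    return w[:H].reshape(H, 1), w[H:2*H], w[2*H:3*H], w[3*H]

def forward(w, X):
    W1, b1, W2, b2 = unpack(w)
    a = np.tanh(X @ W1.T + b1)            # hidden activations, shape (N, H)
    return a @ W2 + b2, a

def loss_and_grads(w, X, y):
    """Batch half-MSE, its gradient, and a damped diagonal Fisher estimate."""
    _, _, W2, _ = unpack(w)
    f, a = forward(w, X)
    r = f - y                             # residuals
    # Per-sample gradients of the network output w.r.t. every parameter,
    # ordered to match unpack(): W1, b1, W2, b2.
    dhid = (1.0 - a**2) * W2              # backprop through tanh, (N, H)
    J = np.hstack([dhid * X, dhid, a, np.ones((len(X), 1))])   # (N, P)
    grad = (r @ J) / len(X)               # gradient of 0.5 * mean(r**2)
    diag_F = np.mean(J**2, axis=0) + 1e-4 # diagonal Gauss-Newton/Fisher + damping
    return 0.5 * np.mean(r**2), grad, diag_F

def line_search(w, d, X, y, f0, g0):
    """Backtracking line search with a simple sufficient-decrease test."""
    eta = 1.0
    while eta > 1e-8:
        f1, _, _ = loss_and_grads(w + eta * d, X, y)
        if f1 < f0 + 1e-4 * eta * (g0 @ d):
            break
        eta *= 0.5
    return eta

w = 0.1 * rng.normal(size=P)
loss, g, diag_F = loss_and_grads(w, X, y)
s = g / diag_F                            # preconditioned ("natural") gradient
d = -s                                    # first descent direction

for it in range(200):
    eta = line_search(w, d, X, y, loss, g)
    w = w + eta * d
    loss, g_new, diag_F = loss_and_grads(w, X, y)
    s_new = g_new / diag_F
    # Polak-Ribiere coefficient computed with the preconditioned gradients.
    beta = max(0.0, ((g_new - g) @ s_new) / (g @ s))
    d = -s_new + beta * d
    g, s = g_new, s_new
    if it % 50 == 0:
        print(f"iter {it:3d}  mse {2 * loss:.5f}")
```

Replacing diag_F with the full Fisher matrix in the same loop would correspond to the full natural CG discussed in the abstract, at the extra cost of forming and solving with a P x P matrix at every iteration, which is precisely the overhead the diagonal variant is meant to avoid.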
Pages: 169-177
Number of pages: 9
Related papers
50 records in total
  • [1] Natural conjugate gradient training of multilayer perceptrons
    Gonzalez, Ana
    Dorronsoro, Jose R.
    [J]. NEUROCOMPUTING, 2008, 71 (13-15) : 2499 - 2506
  • [2] A note on conjugate natural gradient training of multilayer perceptrons
    Gonzalez, Ana
    Dorronsoro, Jose R.
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 887 - +
  • [3] Complexity issues in natural gradient descent method for training multilayer perceptrons
    Yang, HH
    Amari, S
    [J]. NEURAL COMPUTATION, 1998, 10 (08) : 2137 - 2157
  • [4] An improvement to the natural gradient learning algorithm for multilayer perceptrons
    Bastian, MR
    Gunther, JH
    Moon, TK
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 313 - 316
  • [5] Adaptive method of realizing natural gradient learning for multilayer perceptrons
    Amari, S
    Park, H
    Fukumizu, K
    [J]. NEURAL COMPUTATION, 2000, 12 (06) : 1399 - 1409
  • [6] An Adaptive Natural Gradient Method with Adaptive Step Size in Multilayer Perceptrons
    Guo, Weili
    Wei, Haikun
    Liu, Tianhong
    Song, Aiguo
    Zhang, Kanjian
    [J]. 2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 1593 - 1597
  • [7] Fast training of multilayer perceptrons
    Verma, B
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1997, 8 (06) : 1314 - 1320
  • [8] Training multi-layer perceptrons by natural gradient descent
    Yang, HH
    Amari, S
    [J]. PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 211 - 214
  • [9] An adaptive method of training multilayer perceptrons
    Lo, JT
    Bassu, D
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2013 - 2018
  • [10] Robust formulations for training multilayer perceptrons
    Kärkkäinen, T
    Heikkola, E
    [J]. NEURAL COMPUTATION, 2004, 16 (04) : 837 - 862