The efficiency and the robustness of natural gradient descent learning rule

Cited: 0
Authors
Yang, HH [1 ]
Amari, S [1 ]
Affiliation
[1] Oregon Grad Inst, Dept Comp Sci, Portland, OR 97291 USA
Keywords
DOI
Not available
Chinese Library Classification: B84 [Psychology]; C [Social Sciences, General]; Q98 [Anthropology]
Discipline codes: 03; 0303; 030303; 04; 0402
Abstract
The inverse of the Fisher information matrix is used in the natural gradient descent algorithm to train single-layer and multi-layer perceptrons. We have discovered a new scheme to represent the Fisher information matrix of a stochastic multi-layer perceptron. Based on this scheme, we have designed an algorithm to compute the natural gradient. When the input dimension n is much larger than the number of hidden neurons, the complexity of this algorithm is of order O(n). It is confirmed by simulations that the natural gradient descent learning rule is not only efficient but also robust.
Pages: 385-391
Page count: 7
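
A minimal NumPy sketch of the natural gradient update the abstract describes, w <- w - eta * F^{-1} grad, where F is the Fisher information matrix. The single-layer linear model with Gaussian noise, the batch size, and the step size eta are illustrative assumptions; the paper's O(n) scheme for computing the natural gradient in a multi-layer perceptron is not reproduced here, and F is inverted by a direct solve instead.

import numpy as np

rng = np.random.default_rng(0)
n, batch, sigma2, eta = 20, 64, 0.1, 0.5   # assumed sizes, noise variance, step size
w_true = rng.normal(size=n)                # teacher weights (assumed single-layer model)
w = np.zeros(n)                            # student weights to be learned

for step in range(100):
    X = rng.normal(size=(batch, n))                    # minibatch of inputs
    y = X @ w_true + rng.normal(scale=sigma2 ** 0.5, size=batch)
    grad = X.T @ (X @ w - y) / (batch * sigma2)        # gradient of the Gaussian negative log-likelihood
    F = X.T @ X / (batch * sigma2) + 1e-6 * np.eye(n)  # empirical Fisher information, ridge term for stability
    w -= eta * np.linalg.solve(F, grad)                # natural gradient step: w <- w - eta * F^{-1} grad

print("parameter error:", np.linalg.norm(w - w_true))

For this linear-Gaussian model the natural gradient step coincides with a Newton step, which is why it converges in far fewer iterations than plain gradient descent; the robustness claim in the abstract refers to the multi-layer setting studied in the paper.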
Related Papers
50 items in total
  • [1] Rattray, M; Saad, D; Amari, S. Natural gradient descent for on-line learning. PHYSICAL REVIEW LETTERS, 1998, 81(24): 5461-5464.
  • [2] Andrychowicz, Marcin; Denil, Misha; Colmenarejo, Sergio Gomez; Hoffman, Matthew W.; Pfau, David; Schaul, Tom; Shillingford, Brendan; de Freitas, Nando. Learning to learn by gradient descent by gradient descent. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29.
  • [3] Fakhrahmad, S. M.; Jahromi, M. Zolghadri. A new rule-weight learning method based on gradient descent. WORLD CONGRESS ON ENGINEERING 2009, VOLS I AND II, 2009: 63+.
  • [4] Chen, Yutian; Hoffman, Matthew W.; Colmenarejo, Sergio Gomez; Denil, Misha; Lillicrap, Timothy P.; Botvinick, Matt; de Freitas, Nando. Learning to learn without gradient descent by gradient descent. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70.
  • [5] Thomas, Philip S.; da Silva, Bruno Castro; Dann, Christoph; Brunskill, Emma. Energetic natural gradient descent. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48.
  • [6] Inoue, M; Park, H; Okada, M. On-line learning theory of soft committee machines with correlated hidden units: steepest gradient descent and natural gradient descent. JOURNAL OF THE PHYSICAL SOCIETY OF JAPAN, 2003, 72(04): 805-810.
  • [7] Tu, Cheng-Hao; Chen, Hong-You; Carlyn, David; Chao, Wei-Lun. Learning fractals by gradient descent. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023: 2456-2464.
  • [8] Sun, Tao; Tang, Ke; Li, Dongsheng. Gradient descent learning with floats. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52(03): 1763-1771.
  • [9] Biehl, M; Schwarze, H. Learning by online gradient descent. JOURNAL OF PHYSICS A-MATHEMATICAL AND GENERAL, 1995, 28(03): 643-656.
  • [10] Sum, John; Leung, Chi-Sing; Ho, Kevin. A limitation of gradient descent learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31(06): 2227-2232.