Natural gradient descent for on-line learning

Cited by: 71
Authors
Rattray, M
Saad, D
Amari, S
Affiliations
[1] Univ Manchester, Dept Comp Sci, Manchester M13 9PL, Lancs, England
[2] Aston Univ, Neural Comp Res Grp, Birmingham B4 7ET, W Midlands, England
[3] RIKEN, Brain Sci Inst, Lab Informat Synth, Urawa, Saitama, Japan
DOI
10.1103/PhysRevLett.81.5461
Chinese Library Classification
O4 [Physics];
Subject Classification Code
0702;
Abstract
Natural gradient descent is an on-line variable-metric optimization algorithm which utilizes an underlying Riemannian parameter space. We analyze the dynamics of natural gradient descent beyond the asymptotic regime by employing an exact statistical mechanics description of learning in two-layer feed-forward neural networks. For a realizable learning scenario we find significant improvements over standard gradient descent for both the transient and asymptotic stages of learning, with a slower power law increase in learning time as task complexity grows. [S0031-9007(98)07950-2].
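The core idea of natural gradient descent is to precondition the ordinary gradient with the inverse Fisher information matrix, so that update steps respect the Riemannian geometry of the parameter space. The following is a minimal sketch, not from the paper itself: it fits the mean and log-standard-deviation of a Gaussian by maximum likelihood, where the Fisher matrix is known in closed form; the model, parameter names, and learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=1000)

# Parameters: mean mu and log standard deviation s (so sigma = exp(s)).
mu, s = 0.0, 0.0
eta = 0.1  # learning rate

for step in range(200):
    sigma = np.exp(s)
    # Gradients of the average negative log-likelihood
    # NLL(x) = const + s + (x - mu)^2 / (2 sigma^2).
    g_mu = -(data - mu).mean() / sigma**2
    g_s = 1.0 - ((data - mu) ** 2).mean() / sigma**2
    # For this parameterization the Fisher matrix is diagonal:
    # F = diag(1/sigma^2, 2), so the natural gradient F^{-1} g
    # rescales each component separately.
    mu -= eta * sigma**2 * g_mu
    s -= eta * g_s / 2.0

print(mu, np.exp(s))  # close to the sample mean and standard deviation
```

Because the Fisher preconditioning cancels the `1/sigma**2` curvature in the mean direction, the effective step size is the same in all directions of the statistical manifold, which is the mechanism behind the faster transient and asymptotic convergence analyzed in the paper.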
Pages: 5461-5464
Number of pages: 4
Related Papers
50 records in total
  • [1] On-line learning theory of soft committee machines with correlated hidden units - Steepest gradient descent and natural gradient descent
    Inoue, M
    Park, H
    Okada, M
    [J]. JOURNAL OF THE PHYSICAL SOCIETY OF JAPAN, 2003, 72 (04) : 805 - 810
  • [2] Dynamics of on-line gradient descent learning for multilayer neural networks
    Saad, D
    Solla, SA
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 8: PROCEEDINGS OF THE 1995 CONFERENCE, 1996, 8 : 302 - 308
  • [3] Theoretical analysis of batch and on-line training for gradient descent learning in neural networks
    Nakama, Takehiko
    [J]. NEUROCOMPUTING, 2009, 73 (1-3) : 151 - 159
  • [4] Gradient Descent Observer for On-Line Battery Parameter and State Coestimation
    Kruger, Eiko
    Al Shakarchi, Franck
    Quoc Tuan Tran
    [J]. 2016 IEEE/IAS 52ND INDUSTRIAL AND COMMERCIAL POWER SYSTEMS TECHNICAL CONFERENCE (I&CPS), 2016,
  • [5] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin
    Denil, Misha
    Colmenarejo, Sergio Gomez
    Hoffman, Matthew W.
    Pfau, David
    Schaul, Tom
    Shillingford, Brendan
    de Freitas, Nando
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [6] The efficiency and the robustness of natural gradient descent learning rule
    Yang, HH
    Amari, S
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10 : 385 - 391
  • [7] On Gradient Descent Training Under Data Augmentation with On-Line Noisy Copies
    Hagiwara, Katsuyuki
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (09) : 1537 - 1545
  • [8] Novel on-line adaptive learning algorithms for blind deconvolution using the natural gradient approach
    Amari, S
    Douglas, SC
    Cichocki, A
    Yang, HH
    [J]. (SYSID'97): SYSTEM IDENTIFICATION, VOLS 1-3, 1998, : 1007 - 1012
  • [9] Learning to Learn without Gradient Descent by Gradient Descent
    Chen, Yutian
    Hoffman, Matthew W.
    Colmenarejo, Sergio Gomez
    Denil, Misha
    Lillicrap, Timothy P.
    Botvinick, Matt
    de Freitas, Nando
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [10] Energetic Natural Gradient Descent
    Thomas, Philip S.
    da Silva, Bruno Castro
    Dann, Christoph
    Brunskill, Emma
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48