Natural gradient works efficiently in learning

Cited by: 1806
Authors
Amari, S [1]
Affiliations
[1] RIKEN, Frontier Res Program, Wako, Saitama 35101, Japan
DOI
10.1162/089976698300017746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.
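The abstract's central idea, replacing the ordinary gradient with the gradient premultiplied by the inverse Fisher information matrix, can be illustrated with a minimal sketch. The example below is not taken from the paper: it assumes a toy one-dimensional Gaussian model parameterized by (mu, sigma), for which the Fisher information matrix is diagonal with entries 1/sigma^2 and 2/sigma^2, and it assumes a simple 1/t learning-rate schedule. The function names (`nll_grad`, `fisher_diag`, `natural_gradient_step`, `fit`) are hypothetical and chosen only for illustration.

```python
import random


def nll_grad(mu, sigma, x):
    """Ordinary gradient of the per-sample negative log-likelihood of N(mu, sigma^2)."""
    d = x - mu
    return (-d / sigma**2, 1.0 / sigma - d**2 / sigma**3)


def fisher_diag(sigma):
    """Diagonal of the Fisher information matrix in (mu, sigma) coordinates."""
    return (1.0 / sigma**2, 2.0 / sigma**2)


def natural_gradient_step(mu, sigma, x, lr):
    """One online update: theta <- theta - lr * G^{-1} * grad."""
    g_mu, g_sigma = nll_grad(mu, sigma, x)
    f_mu, f_sigma = fisher_diag(sigma)
    # G is diagonal for this toy model, so applying G^{-1} is elementwise division.
    mu_new = mu - lr * g_mu / f_mu
    sigma_new = sigma - lr * g_sigma / f_sigma
    # Guard (an assumption of this sketch) to keep sigma strictly positive.
    return mu_new, max(sigma_new, 1e-6)


def fit(samples, mu=0.0, sigma=1.0):
    """Online natural gradient learning with an assumed 1/t learning rate."""
    for t, x in enumerate(samples, start=1):
        mu, sigma = natural_gradient_step(mu, sigma, x, lr=1.0 / t)
    return mu, sigma


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(3.0, 2.0) for _ in range(20000)]
    mu_hat, sigma_hat = fit(data)
    print("online estimate:", mu_hat, sigma_hat)
    print("batch sample mean:", sum(data) / len(data))
```

In this toy setting, the 1/t schedule makes the update for mu reduce to mu + (x - mu)/t, so the online estimate of mu coincides with the running sample mean, which is the batch maximum-likelihood estimate. This is only a simple instance in the spirit of the efficiency claim above; the paper's general analysis and its adaptive learning-rate method are not reproduced here.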
Pages: 251-276
Number of pages: 26
Related papers
50 in total
  • [21] On-line profit sharing works efficiently
    Matsui, T
    Inuzuka, N
    Seki, H
    [J]. KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS, 2003, 2773 : 317 - 324
  • [22] Smart NC works faster, more efficiently
    Unknown
    [J]. MACHINE DESIGN, 1998, 70 (06) : 108 - 108
  • [23] Learning to Efficiently Rank
    Wang, Lidan
    Lin, Jimmy
    Metzler, Donald
    [J]. SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010, : 138 - 145
  • [24] Learning to Drive (Efficiently)
    Dominka, Sven
    Doppler, Joerg
    Smith, Henrik
    Litschauer, Teresa
    Laflamme, Catherine
    [J]. 2024 IEEE INTERNATIONAL CONFERENCE ON ELECTRO INFORMATION TECHNOLOGY, EIT 2024, 2024, : 111 - 116
  • [25] Cooling with natural refrigerants efficiently
    Unknown
    [J]. FLEISCHWIRTSCHAFT, 2012, 92 (04) : 64 - 65
  • [26] Quantum Statistical Learning via Quantum Wasserstein Natural Gradient
    Simon Becker
    Wuchen Li
    [J]. Journal of Statistical Physics, 2021, 182
  • [27] Fisher Information and Natural Gradient Learning in Random Deep Networks
    Amari, Shun-ichi
    Karakida, Ryo
    Oizumi, Masafumi
    [J]. 22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89 : 694 - 702
  • [28] Quantum Statistical Learning via Quantum Wasserstein Natural Gradient
    Becker, Simon
    Li, Wuchen
    [J]. JOURNAL OF STATISTICAL PHYSICS, 2021, 182 (01)
  • [29] Natural Gradient Primal-Dual Method for Decentralized Learning
    Niwa, Kenta
    Ishii, Hiro
    Sawada, Hiroshi
    Fujino, Akinori
    Harada, Noboru
    Yokota, Rio
    [J]. IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2024, 10 : 417 - 433
  • [30] Adaptive natural gradient learning algorithms for various stochastic models
    Park, H
    Amari, SI
    Fukumizu, K
    [J]. NEURAL NETWORKS, 2000, 13 (07) : 755 - 764