Natural gradient works efficiently in learning

Cited by: 1806

Authors:

Amari, S [1]

Affiliations:

[1] RIKEN, Frontier Res Program, Wako, Saitama 35101, Japan

Source:

NEURAL COMPUTATION | 1998 / Vol. 10 / Issue 02

Keywords:

DOI:

10.1162/089976698300017746

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.

Pages: 251-276

Page count: 26
