Adaptive Natural Gradient Learning Algorithms for Unnormalized Statistical Models

Cited by: 1
Authors
Karakida, Ryo [1 ]
Okada, Masato [1 ,2 ]
Amari, Shun-ichi [2 ]
Affiliations
[1] Univ Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778561, Japan
[2] RIKEN Brain Sci Inst, 2-1 Hirosawa, Wako, Saitama 3510198, Japan
Keywords
Natural gradient; Score matching; Ratio matching; Unnormalized statistical model; Multi-layer neural network;
DOI
10.1007/978-3-319-44778-0_50
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The natural gradient is a powerful method that improves the transient dynamics of learning by exploiting the geometric structure of the parameter space. Many natural gradient methods have been developed for maximum likelihood learning, which is based on the Kullback-Leibler (KL) divergence and its Fisher metric. However, these methods require computing the normalization constant and are therefore not applicable to statistical models whose normalization constant is analytically intractable. In this study, we extend the natural gradient framework to divergences for unnormalized statistical models: score matching and ratio matching. In addition, we derive novel adaptive natural gradient algorithms that avoid the computationally demanding inversion of the metric, and we demonstrate their effectiveness in numerical experiments. In particular, experimental results on a multi-layer neural network model show that the proposed method escapes the plateau phenomenon much faster than conventional stochastic gradient descent.
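To make the idea concrete, the following is a minimal illustrative sketch (not the authors' algorithm) of the two ingredients the abstract combines: the score matching objective for an unnormalized model, and an adaptive metric that is updated online by exponential moving average instead of being recomputed and inverted at every step. The model, hyperparameters, and metric estimator are assumptions chosen for a toy one-parameter Gaussian whose normalizer is ignored.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=5000)  # data; the optimum is theta = 1 / E[x^2] = 1

# Toy unnormalized model: log p~(x; theta) = -theta * x^2 / 2 (normalizer dropped).
# Hyvarinen's score matching objective, after integration by parts:
#   J(theta) = E[ 0.5 * psi(x)^2 + psi'(x) ],  psi(x) = d/dx log p~(x) = -theta * x
#            = E[ 0.5 * theta^2 * x^2 - theta ],
# so dJ/dtheta = E[ theta * x^2 - 1 ], minimized at theta = 1 / E[x^2].

theta = 5.0            # deliberately poor initialization
G = 1.0                # running metric estimate (a scalar for this 1-parameter model)
eta, eps = 0.05, 0.05  # learning rate and metric adaptation rate (assumed values)
batch = 100

for _ in range(500):
    xb = rng.choice(x, size=batch)
    grad = theta * np.mean(xb**2) - 1.0  # stochastic gradient of J
    # Adapt the metric online instead of inverting a full matrix each step;
    # here it is an EMA of E[(d psi / d theta)^2] = E[x^2].
    G = (1.0 - eps) * G + eps * np.mean(xb**2)
    theta -= eta * grad / G              # natural-gradient-style update

print(theta)  # close to 1 / E[x^2], i.e. approximately 1
```

For a multi-parameter model, `G` becomes a matrix (or a diagonal approximation), and the point of the adaptive schemes in the paper is precisely to track its inverse action cheaply rather than solving a linear system per update.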
Pages: 427-434
Page count: 8