Adaptive Natural Gradient Learning Algorithms for Unnormalized Statistical Models

Cited by: 1
Authors
Karakida, Ryo [1]
Okada, Masato [1,2]
Amari, Shun-ichi [2]
Affiliations
[1] Univ Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778561, Japan
[2] RIKEN Brain Sci Inst, 2-1 Hirosawa, Wako, Saitama 3510198, Japan
Keywords
Natural gradient; Score matching; Ratio matching; Unnormalized statistical model; Multi-layer neural network
DOI
10.1007/978-3-319-44778-0_50
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The natural gradient is a powerful method to improve the transient dynamics of learning by utilizing the geometric structure of the parameter space. Many natural gradient methods have been developed for maximum likelihood learning, which is based on Kullback-Leibler (KL) divergence and its Fisher metric. However, they require the computation of the normalization constant and are not applicable to statistical models with an analytically intractable normalization constant. In this study, we extend the natural gradient framework to divergences for the unnormalized statistical models: score matching and ratio matching. In addition, we derive novel adaptive natural gradient algorithms that do not require computationally demanding inversion of the metric and show their effectiveness in some numerical experiments. In particular, experimental results in a multi-layer neural network model demonstrate that the proposed method can escape from the plateau phenomena much faster than the conventional stochastic gradient descent method.
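The idea of avoiding explicit inversion of the metric can be illustrated with a minimal sketch. It assumes the Fisher-style recursion from the earlier adaptive natural gradient literature, in which the metric is tracked as a running average of gradient outer products, G <- (1 - eps) G + eps g g^T, and its inverse is updated to first order in eps. This is not the score-matching or ratio-matching metric derived in the paper; the function names and the step sizes `eps` and `lr` are hypothetical.

```python
# Sketch only: Fisher-style adaptive inverse-metric update (an assumption),
# not the score-matching / ratio-matching metric proposed in the paper.

def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def adaptive_inverse_update(G_inv, g, eps):
    """First-order update of the inverse of G <- (1 - eps) G + eps g g^T.

    Uses G_inv <- (1 + eps) G_inv - eps (G_inv g)(G_inv g)^T,
    which assumes G_inv is symmetric and eps is small.
    """
    u = mat_vec(G_inv, g)
    n = len(g)
    return [[(1 + eps) * G_inv[i][j] - eps * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def natural_gradient_step(theta, grad, G_inv, lr):
    """One step theta <- theta - lr * G_inv * grad."""
    direction = mat_vec(G_inv, grad)
    return [t - lr * d for t, d in zip(theta, direction)]

if __name__ == "__main__":
    G_inv = [[1.0, 0.0], [0.0, 1.0]]   # start from the identity
    g = [1.0, 0.0]                     # hypothetical per-sample gradient
    G_inv = adaptive_inverse_update(G_inv, g, eps=0.1)
    theta = natural_gradient_step([1.0, 2.0], [0.5, 0.5], G_inv, lr=0.1)
    print(G_inv, theta)
```

Each update costs on the order of n^2 operations for n parameters, whereas inverting the metric directly costs on the order of n^3, which is the motivation for an adaptive estimate.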
Pages: 427-434
Page count: 8
Related papers
50 in total
  • [1] Adaptive natural gradient learning algorithms for various stochastic models
    Park, H
    Amari, SI
    Fukumizu, K
    [J]. NEURAL NETWORKS, 2000, 13 (07) : 755 - 764
  • [2] Quickest Change Detection for Unnormalized Statistical Models
    Wu, Suya
    Diao, Enmao
    Banerjee, Taposh
    Ding, Jie
    Tarokh, Vahid
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (02) : 1220 - 1232
  • [3] Natural gradient learning algorithms for decorrelation
    Choi, S
    Amari, S
    Cichocki, A
    [J]. PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 645 - 648
  • [4] Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics
    Gutmann, Michael U.
    Hyvarinen, Aapo
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 307 - 361
  • [5] Adaptive natural gradient learning algorithms for Mackey-Glass chaotic time prediction
    Zhao, Junsheng
    Yu, Xingjiang
    [J]. NEUROCOMPUTING, 2015, 157 : 41 - 45
  • [6] Natural Gradient Learning Algorithms for RBF Networks
    Zhao, Junsheng
    Wei, Haikun
    Zhang, Chi
    Li, Weiling
    Guo, Weili
    Zhang, Kanjian
    [J]. NEURAL COMPUTATION, 2015, 27 (02) : 481 - 505
  • [7] Natural gradient learning algorithms for nonlinear systems
    Zhao Junsheng
    Xia Jianwei
    Zhuang Guangming
    Zhang Huasheng
    [J]. 2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 1979 - 1983
  • [8] Novel on-line adaptive learning algorithms for blind deconvolution using the natural gradient approach
    Amari, S
    Douglas, SC
    Cichocki, A
    Yang, HH
    [J]. (SYSID'97): SYSTEM IDENTIFICATION, VOLS 1-3, 1998, : 1007 - 1012
  • [9] Adaptive Natural Policy Gradient in Reinforcement Learning
    Li, Dazi
    Qiao, Zengyuan
    Song, Tianheng
    Jin, Qibing
    [J]. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
  • [10] Statistical Inference with Unnormalized Discrete Models and Localized Homogeneous Divergences
    Takenouchi, Takashi
    Kanamori, Takafumi
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18 : 1 - 26