Energetic Natural Gradient Descent

Citations: 0
Authors
Thomas, Philip S. [1 ]
da Silva, Bruno Castro [2 ]
Dann, Christoph [3 ]
Brunskill, Emma [3 ]
Affiliations
[1] Univ Massachusetts, Amherst, MA 01003 USA
[2] Univ Fed Rio Grande do Sul, Porto Alegre, RS, Brazil
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.
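To make the update concrete, below is a minimal, hypothetical sketch of a natural-gradient step in Python. It uses the standard Monte Carlo estimate of the Fisher information matrix as the metric tensor G; the paper's proposal is to swap in a metric tensor derived from an approximation of energy distance instead, whose exact form is given in the paper and is not reproduced here. The toy Gaussian model, function names, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch of a natural-gradient ascent step. The update is
#   theta <- theta + lr * G^{-1} grad_J,
# where G is a metric tensor on parameter space. Here G is the usual
# Monte Carlo Fisher estimate; the energetic natural gradient of the
# paper would substitute a metric derived from energy distance.

def fisher_metric(scores, damping=1e-3):
    """Estimate G = E[s s^T] from per-sample score vectors
    s = grad_theta log p(x; theta), shape (n_samples, n_params)."""
    s = np.asarray(scores)
    G = s.T @ s / s.shape[0]
    return G + damping * np.eye(G.shape[0])  # damping keeps G invertible

def natural_gradient_step(theta, grad_J, G, lr=0.5):
    """One ascent step: move along G^{-1} grad_J rather than grad_J."""
    return theta + lr * np.linalg.solve(G, grad_J)

# Toy usage: maximum-likelihood fit of the mean of a unit-variance
# Gaussian. For N(mu, 1), the score is grad_mu log p(x) = x - mu.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)
theta = np.array([0.0])  # theta = [mu]

for _ in range(100):
    scores = (data - theta[0]).reshape(-1, 1)   # per-sample score vectors
    grad_J = scores.mean(axis=0)                # gradient of avg log-likelihood
    G = fisher_metric(scores)                   # swap in an energetic metric here
    theta = natural_gradient_step(theta, grad_J, G)

print(theta)  # approaches the sample mean of `data` (about 2.0)
```

Keeping G as an explicit argument reflects the structure of the method: the ascent rule is agnostic to which metric tensor is supplied, and the Fisher and energetic variants differ only in how G is estimated.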
Pages: 9
Related Papers
50 items in total (10 shown)
  • [1] Convergent Stochastic Almost Natural Gradient Descent
    Sanchez-Lopez, Borja; Cerquides, Jesus
    Artificial Intelligence Research and Development, 2019, 319: 54-63
  • [2] Projective Fisher Information for Natural Gradient Descent
    Kaul, Piyush; Lall, Brejesh
    IEEE Transactions on Artificial Intelligence, 2023, 4(2): 304-314
  • [3] Natural gradient descent for on-line learning
    Rattray, M.; Saad, D.; Amari, S.
    Physical Review Letters, 1998, 81(24): 5461-5464
  • [4] Stochastic Natural Gradient Descent by Estimation of Empirical Covariances
    Malago, Luigi; Matteucci, Matteo; Pistone, Giovanni
    2011 IEEE Congress on Evolutionary Computation (CEC), 2011: 949-956
  • [5] Analysis of natural gradient descent for multilayer neural networks
    Rattray, M.; Saad, D.
    Physical Review E, 1999, 59(4): 4523-4532
  • [6] Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
    Kunstner, Frederik; Balles, Lukas; Hennig, Philipp
    Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019, 32
  • [7] Optimization of Graph Neural Networks with Natural Gradient Descent
    Izadi, Mohammad Rasool; Fang, Yihao; Stevenson, Robert; Lin, Lizhen
    2020 IEEE International Conference on Big Data (Big Data), 2020: 171-179
  • [8] The efficiency and the robustness of natural gradient descent learning rule
    Yang, H. H.; Amari, S.
    Advances in Neural Information Processing Systems 10, 1998, 10: 385-391
  • [9] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin; Denil, Misha; Colmenarejo, Sergio Gomez; Hoffman, Matthew W.; Pfau, David; Schaul, Tom; Shillingford, Brendan; de Freitas, Nando
    Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016, 29
  • [10] Learning to Learn without Gradient Descent by Gradient Descent
    Chen, Yutian; Hoffman, Matthew W.; Colmenarejo, Sergio Gomez; Denil, Misha; Lillicrap, Timothy P.; Botvinick, Matt; de Freitas, Nando
    International Conference on Machine Learning, Vol 70, 2017, 70