Learning to Learn without Gradient Descent by Gradient Descent

被引:0
|
作者
Chen, Yutian [1 ]
Hoffman, Matthew W. [1 ]
Colmenarejo, Sergio Gomez [1 ]
Denil, Misha [1 ]
Lillicrap, Timothy P. [1 ]
Botvinick, Matt [1 ]
de Freitas, Nando [1 ]
机构
[1] DeepMind, London, England
关键词
NETWORK;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade-off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
引用
收藏
页数:9
相关论文
共 50 条
  • [1] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin
    Denil, Misha
    Colmenarejo, Sergio Gomez
    Hoffman, Matthew W.
    Pfau, David
    Schaul, Tom
    Shillingford, Brendan
    de Freitas, Nando
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [2] Learning to Learn Gradient Aggregation by Gradient Descent
    Ji, Jinlong
    Chen, Xuhui
    Wang, Qianlong
    Yu, Lixing
    Li, Pan
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2614 - 2620
  • [3] Learning to learn using gradient descent
    Hochreiter, S
    Younger, AS
    Conwell, PR
    [J]. ARTIFICIAL NEURAL NETWORKS-ICANN 2001, PROCEEDINGS, 2001, 2130 : 87 - 94
  • [4] Stochastic Gradient Descent for Nonconvex Learning Without Bounded Gradient Assumptions
    Lei, Yunwen
    Hu, Ting
    Li, Guiying
    Tang, Ke
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (10) : 4394 - 4400
  • [5] Stein Variational Gradient Descent Without Gradient
    Han, Jun
    Liu, Qiang
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [6] Learning-to-Learn Stochastic Gradient Descent with Biased Regularization
    Denevi, Giulia
    Ciliberto, Carlo
    Grazzi, Riccardo
    Pontil, Massimiliano
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [7] Gradient learning in a classification setting by gradient descent
    Cai, Jia
    Wang, Hongyan
    Zhou, Ding-Xuan
    [J]. JOURNAL OF APPROXIMATION THEORY, 2009, 161 (02) : 674 - 692
  • [8] Learning Fractals by Gradient Descent
    Tu, Cheng-Hao
    Chen, Hong-You
    Carlyn, David
    Chao, Wei-Lun
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2456 - 2464
  • [9] Gradient Descent Learning With Floats
    Sun, Tao
    Tang, Ke
    Li, Dongsheng
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (03) : 1763 - 1771
  • [10] LEARNING BY ONLINE GRADIENT DESCENT
    BIEHL, M
    SCHWARZE, H
    [J]. JOURNAL OF PHYSICS A-MATHEMATICAL AND GENERAL, 1995, 28 (03): : 643 - 656