Forward and Reverse Gradient-Based Hyperparameter Optimization

Cited by: 0
|
Authors
Franceschi, Luca [1 ,2 ]
Donini, Michele [1 ]
Frasconi, Paolo [3 ]
Pontil, Massimiliano [1 ,2 ]
Affiliations
[1] Ist Italiano Tecnol, Computat Stat & Machine Learning, Genoa, Italy
[2] UCL, Dept Comp Sci, London, England
[3] Univ Firenze, Dept Informat Engn, Florence, Italy
Keywords
DOI
Not available
CLC number
TP18 [Theory of Artificial Intelligence]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent. These procedures mirror two methods of computing gradients for recurrent neural networks and have different trade-offs in terms of running time and space requirements. Our formulation of the reverse-mode procedure is linked to previous work by Maclaurin et al. (2015) but does not require reversible dynamics. The forward-mode procedure is suitable for real-time hyperparameter updates, which may significantly speed up hyperparameter optimization on large datasets. We present experiments on data cleaning and on learning task interactions. We also present one large-scale experiment where the use of previous gradient-based methods would be prohibitive.
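The forward-mode procedure described in the abstract propagates the derivative of the weights with respect to a hyperparameter alongside the training iterates, so the hypergradient is available as soon as training finishes. A minimal sketch under toy assumptions (a 1-D quadratic training and validation loss, with the learning rate as the only hyperparameter; the values `a`, `b`, `T`, `eta` and all function names are illustrative, not from the paper):

```python
# Hypothetical toy problem: train loss 0.5*(w - a)^2, validation loss
# 0.5*(w - b)^2; the hyperparameter is the step size eta of T gradient steps.
a, b = 2.0, 3.0
T, eta = 20, 0.1

def train_grad(w): return w - a   # dL_train/dw
def train_hess(w): return 1.0     # d^2 L_train / dw^2
def val_grad(w):   return w - b   # dL_val/dw

# Forward mode: carry z_t = dw_t/d(eta) along with the weights.
w, z = 0.0, 0.0
for _ in range(T):
    g = train_grad(w)
    z = z - g - eta * train_hess(w) * z   # d/d(eta) of the update below
    w = w - eta * g
hypergrad_fwd = val_grad(w) * z           # d(val loss)/d(eta) at w_T

# Sanity check against a central finite difference in eta.
def val_loss_after_training(step):
    u = 0.0
    for _ in range(T):
        u -= step * train_grad(u)
    return 0.5 * (u - b) ** 2

eps = 1e-6
fd = (val_loss_after_training(eta + eps)
      - val_loss_after_training(eta - eps)) / (2 * eps)
assert abs(hypergrad_fwd - fd) < 1e-6
```

Forward mode needs one extra tangent vector per hyperparameter and no stored trajectory, which is why it suits the real-time updates the abstract mentions; reverse mode instead back-propagates through the stored (or reconstructed) sequence of iterates.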
Pages: 9
Related papers
(50 records in total)
  • [1] Gradient-based Hyperparameter Optimization Over Long Horizons
    Micaelli, Paul
    Storkey, Amos
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Gradient-based Hyperparameter Optimization through Reversible Learning
    Maclaurin, Dougal
    Duvenaud, David
    Adams, Ryan P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2113 - 2122
  • [3] Comprehensive analysis of gradient-based hyperparameter optimization algorithms
    Bakhteev, O. Y.
    Strijov, V. V.
    [J]. ANNALS OF OPERATIONS RESEARCH, 2020, 289 : 51 - 65
  • [4] Comprehensive analysis of gradient-based hyperparameter optimization algorithms
    Bakhteev, O. Y.
    Strijov, V. V.
    [J]. ANNALS OF OPERATIONS RESEARCH, 2020, 289 (01) : 51 - 65
  • [5] EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization
    Bohdal, Ondrej
    Yang, Yongxin
    Hospedales, Timothy
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Dynamics of gradient-based learning and applications to hyperparameter estimation
    Wong, KYM
    Luo, PX
    Li, FL
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, 2003, 2690 : 369 - 376
  • [7] Gradient-based optimization of hyperparameters
    Bengio, Y
    [J]. NEURAL COMPUTATION, 2000, 12 (08) : 1889 - 1900
  • [8] Gradient-based simulation optimization
    Kim, Sujin
    [J]. PROCEEDINGS OF THE 2006 WINTER SIMULATION CONFERENCE, VOLS 1-5, 2006, : 159 - 167
  • [9] Gradient-based learning and optimization
    Cao, XR
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES, 2003, : 3 - 7
  • [10] Reverse shape compensation via a gradient-based moving particle optimization method
    Deng, Hao
    To, Albert C.
    [J]. COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2021, 377