Forward and Reverse Gradient-Based Hyperparameter Optimization

Cited by: 0
|
Authors
Franceschi, Luca [1 ,2 ]
Donini, Michele [1 ]
Frasconi, Paolo [3 ]
Pontil, Massimiliano [1 ,2 ]
Affiliations
[1] Ist Italiano Tecnol, Computat Stat & Machine Learning, Genoa, Italy
[2] UCL, Dept Comp Sci, London, England
[3] Univ Firenze, Dept Informat Engn, Florence, Italy
Keywords
DOI
Not available
CLC number
TP18 [Theory of Artificial Intelligence]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent. These procedures mirror two methods of computing gradients for recurrent neural networks and have different trade-offs in terms of running time and space requirements. Our formulation of the reverse-mode procedure is linked to previous work by Maclaurin et al. (2015) but does not require reversible dynamics. The forward-mode procedure is suitable for real-time hyperparameter updates, which may significantly speed up hyperparameter optimization on large datasets. We present experiments on data cleaning and on learning task interactions. We also present one large-scale experiment where the use of previous gradient-based methods would be prohibitive.
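The forward-mode procedure described in the abstract propagates the derivative of the weights with respect to a hyperparameter alongside the training iterates, so the hypergradient is available as soon as training finishes. A minimal sketch under toy assumptions (a 1-D quadratic training and validation loss, with the learning rate as the only hyperparameter; the values `a`, `b`, `T`, `eta` and all function names are illustrative, not from the paper):

```python
# Hypothetical toy problem: train loss 0.5*(w - a)^2, validation loss
# 0.5*(w - b)^2; the hyperparameter is the step size eta of T gradient steps.
a, b = 2.0, 3.0
T, eta = 20, 0.1

def train_grad(w): return w - a   # dL_train/dw
def train_hess(w): return 1.0     # d^2 L_train / dw^2
def val_grad(w):   return w - b   # dL_val/dw

# Forward mode: carry z_t = dw_t/d(eta) along with the weights.
w, z = 0.0, 0.0
for _ in range(T):
    g = train_grad(w)
    z = z - g - eta * train_hess(w) * z   # d/d(eta) of the update below
    w = w - eta * g
hypergrad_fwd = val_grad(w) * z           # d(val loss)/d(eta) at w_T

# Sanity check against a central finite difference in eta.
def val_loss_after_training(step):
    u = 0.0
    for _ in range(T):
        u -= step * train_grad(u)
    return 0.5 * (u - b) ** 2

eps = 1e-6
fd = (val_loss_after_training(eta + eps)
      - val_loss_after_training(eta - eps)) / (2 * eps)
assert abs(hypergrad_fwd - fd) < 1e-6
```

Forward mode needs one extra tangent vector per hyperparameter and no stored trajectory, which is why it suits the real-time updates the abstract mentions; reverse mode instead back-propagates through the stored (or reconstructed) sequence of iterates.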
Pages: 9
Related papers
(50 records in total)
  • [1] Gradient-based Hyperparameter Optimization Over Long Horizons
    Micaelli, Paul
    Storkey, Amos
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Gradient-based Hyperparameter Optimization through Reversible Learning
    Maclaurin, Dougal
    Duvenaud, David
    Adams, Ryan P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2113 - 2122
  • [3] Comprehensive analysis of gradient-based hyperparameter optimization algorithms
    Bakhteev, O. Y.
    Strijov, V. V.
    [J]. ANNALS OF OPERATIONS RESEARCH, 2020, 289 : 51 - 65
  • [4] Comprehensive analysis of gradient-based hyperparameter optimization algorithms
    Bakhteev, O. Y.
    Strijov, V. V.
    [J]. ANNALS OF OPERATIONS RESEARCH, 2020, 289 (01) : 51 - 65
  • [5] EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization
    Bohdal, Ondrej
    Yang, Yongxin
    Hospedales, Timothy
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Dynamics of gradient-based learning and applications to hyperparameter estimation
    Wong, KYM
    Luo, PX
    Li, FL
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, 2003, 2690 : 369 - 376
  • [7] Gradient-based optimization of hyperparameters
    Bengio, Y
    [J]. NEURAL COMPUTATION, 2000, 12 (08) : 1889 - 1900
  • [8] Gradient-based simulation optimization
    Kim, Sujin
    [J]. PROCEEDINGS OF THE 2006 WINTER SIMULATION CONFERENCE, VOLS 1-5, 2006, : 159 - 167
  • [9] Gradient-based learning and optimization
    Cao, XR
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES, 2003, : 3 - 7
  • [10] Reverse shape compensation via a gradient-based moving particle optimization method
    Deng, Hao
    To, Albert C.
    [J]. COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2021, 377