On the Iteration Complexity of Hypergradient Computation

Authors
Grazzi, Riccardo [1 ,2 ]
Franceschi, Luca [1 ,2 ]
Pontil, Massimiliano [1 ,2 ]
Salzo, Saverio [1 ]
Affiliations
[1] Istituto Italiano di Tecnologia, Computational Statistics and Machine Learning, Genoa, Italy
[2] University College London, Department of Computer Science, London, England
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study a general class of bilevel problems, consisting of the minimization of an upper-level objective that depends on the solution to a parametric fixed-point equation. Important instances arising in machine learning include hyperparameter optimization, meta-learning, and certain graph and recurrent neural networks. Typically, the gradient of the upper-level objective (the hypergradient) is hard or even impossible to compute exactly, which has raised interest in approximation methods. We investigate some popular approaches to computing the hypergradient, based on reverse-mode iterative differentiation and approximate implicit differentiation. Under the hypothesis that the fixed-point equation is defined by a contraction mapping, we present a unified analysis that allows, for the first time, a quantitative comparison of these methods, providing explicit bounds for their iteration complexity. This analysis suggests a hierarchy in terms of computational efficiency among the above methods, with approximate implicit differentiation based on conjugate gradient performing best. We present an extensive experimental comparison among the methods, which confirms the theoretical findings.
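The two families of methods named in the abstract can be illustrated with a minimal sketch on a scalar toy problem. Everything here is an assumption made for illustration and is not taken from the paper: the inner map `phi(w, lam) = Q * tanh(w) + lam` (a contraction in `w` because `0 < Q < 1`), the outer loss `E(w) = 0.5 * (w - TARGET)**2`, and all function names. The implicit-differentiation variant below solves its linear system by fixed-point iteration rather than by conjugate gradient, since the system is one-dimensional.

```python
import math

Q = 0.5        # hypothetical contraction constant, 0 < Q < 1
TARGET = 1.0   # hypothetical target for the upper-level loss


def phi(w, lam):
    """Inner map; w(lam) is defined by the fixed point w = phi(w, lam)."""
    return Q * math.tanh(w) + lam


def dphi_dw(w):
    """Partial derivative of phi with respect to w (at most Q in magnitude)."""
    return Q * (1.0 - math.tanh(w) ** 2)


DPHI_DLAM = 1.0  # partial derivative of phi with respect to lam


def grad_E(w):
    """Derivative of the upper-level loss E(w) = 0.5 * (w - TARGET)**2."""
    return w - TARGET


def itd_hypergradient(lam, t):
    """Reverse-mode iterative differentiation through t inner steps from w_0 = 0."""
    ws = [0.0]
    for _ in range(t):
        ws.append(phi(ws[-1], lam))
    alpha = grad_E(ws[-1])           # adjoint of w_t
    g = 0.0
    for i in range(t - 1, -1, -1):
        g += DPHI_DLAM * alpha       # direct dependence of w_{i+1} on lam
        alpha = dphi_dw(ws[i]) * alpha   # adjoint of w_i
    return g


def aid_fp_hypergradient(lam, t, k):
    """Approximate implicit differentiation: t inner steps, then k fixed-point
    steps on the linear system v = dphi_dw(w_t) * v + grad_E(w_t)."""
    w = 0.0
    for _ in range(t):
        w = phi(w, lam)
    a, b = dphi_dw(w), grad_E(w)
    v = 0.0
    for _ in range(k):
        v = a * v + b
    return DPHI_DLAM * v             # E has no direct dependence on lam here


def reference_hypergradient(lam, iters=200):
    """Implicit-function-theorem formula evaluated at a well-converged fixed point."""
    w = 0.0
    for _ in range(iters):
        w = phi(w, lam)
    return DPHI_DLAM * grad_E(w) / (1.0 - dphi_dw(w))
```

With, say, `lam = 0.3`, both `itd_hypergradient(lam, 60)` and `aid_fp_hypergradient(lam, 60, 60)` should agree closely with `reference_hypergradient(lam)`, and shrinking `t` or `k` shows how the approximation error depends on the number of iterations in this toy setting.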
Pages: 11
Related papers
50 in total
  • [31] THE COMPUTATION OF RADIATION POLARIZATION CHARACTERISTICS BY THE ITERATION METHOD
    STRELKOV, SA
    SUSHKEVICH, TA
    IZVESTIYA AKADEMII NAUK SSSR FIZIKA ATMOSFERY I OKEANA, 1983, 19 (03): : 322 - 324
  • [32] An Iteration Method for Computation of Flexible Fender Piles
    HU Liwan and LIU Xianzhi Associate Professor
    China Ocean Engineering, 1998, (04) : 435 - 442
  • [33] Complexity generated by iteration of hierarchical modules in bryozoa
    Hageman, SJ
    INTEGRATIVE AND COMPARATIVE BIOLOGY, 2003, 43 (01) : 87 - 98
  • [34] CONVERGENCE AND COMPLEXITY OF NEWTON ITERATION FOR OPERATOR EQUATIONS
    TRAUB, JF
    WOZNIAKOWSKI, H
    JOURNAL OF THE ACM, 1979, 26 (02) : 250 - 258
  • [35] Randomness complexity of private computation
    C. Blundo
    A. De Santis
    G. Persiano
    U. Vaccaro
    computational complexity, 1999, 8 : 145 - 168
  • [36] Complexity and computation in matrix groups
    Niemeyer, AC
    Praeger, CE
    ASPECTS OF COMPLEXITY: MINICOURSES IN ALGORITHMICS, COMPLEXITY AND COMPUTATIONAL ALGEBRA, 2001, 4 : 87 - 113
  • [37] On the Communication Complexity of Secure Computation
    Data, Deepesh
    Prabhakaran, Manoj M.
    Prabhakaran, Vinod M.
    ADVANCES IN CRYPTOLOGY - CRYPTO 2014, PT II, 2014, 8617 : 199 - 216
  • [38] Complexity and real computation: A manifesto
    Blum, L
    Cucker, F
    Shub, M
    Smale, S
    INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS, 1996, 6 (01): : 3 - 26
  • [39] OVERVIEW OF COMPLEXITY AND ITS COMPUTATION
    Costa, M.
    GERONTOLOGIST, 2011, 51 : 606 - 607
  • [40] On the Round Complexity of Covert Computation
    Goyal, Vipul
    Jain, Abhishek
    STOC 2010: PROCEEDINGS OF THE 2010 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2010, : 191 - 200