Towards closing the gap between the theory and practice of SVRG

Cited by: 0
Authors
Sebbouh, Othmane [1]
Gazagnadou, Nidham [1]
Jelassi, Samy [2]
Bach, Francis [3]
Gower, Robert M. [1]
Affiliations
[1] Telecom Paris, Inst Polytech Paris, LTCI, Paris, France
[2] Princeton Univ, ORFE Dept, Princeton, NJ 08544 USA
[3] PSL Res Univ, INRIA Ecole Normale Super, Paris, France
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Amongst the very first variance reduced stochastic methods for solving the empirical risk minimization problem was the SVRG method [13]. SVRG is an inner-outer loop based method: in the outer loop a reference full gradient is evaluated, after which m ∈ N steps of an inner loop are executed, where the reference gradient is used to build a variance reduced estimate of the current gradient. The simplicity of the SVRG method and its analysis have led to multiple extensions and variants, even for non-convex optimization. Yet there is a significant gap between the parameter settings that the analysis suggests and what is known to work well in practice. Our first contribution is that we take several steps towards closing this gap. In particular, the current analysis shows that m should be of the order of the condition number so that the resulting method has a favorable complexity. Yet in practice m = n works well regardless of the condition number, where n is the number of data points. Furthermore, the current analysis shows that the inner iterates have to be reset using averaging after every outer loop. Yet in practice SVRG works best when the inner iterates are updated continuously and not reset. We provide an analysis of these practical settings and show that they achieve the same favorable complexity as the original analysis (with slightly better constants). Our second contribution is to provide a more general analysis than had previously been done by using arbitrary sampling, which allows us to analyse virtually all forms of mini-batching through a single theorem. Since our setup and analysis reflect what is done in practice, we are able to use our theory to set parameters such as the mini-batch size and step size in a way that produces a more efficient algorithm in practice, as we show in extensive numerical experiments.
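The inner-outer loop structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `svrg` and `grad_i`, the toy least-squares data, the step size, and the use of uniform single-sample selection (rather than the arbitrary sampling and mini-batching the paper analyses) are all assumptions made for illustration. The sketch follows the practical setting mentioned in the abstract, with m = n inner steps by default and inner iterates that continue across outer loops without being reset.

```python
import random

def svrg(grad_i, n, x0, step, epochs, m=None, seed=0):
    """Illustrative SVRG sketch: m inner steps per outer loop, no iterate reset."""
    rng = random.Random(seed)
    m = n if m is None else m  # assumed default: the practical choice m = n
    x = list(x0)
    d = len(x)
    for _ in range(epochs):
        # Outer loop: store a reference point w and its full gradient.
        w = list(x)
        full = [0.0] * d
        for i in range(n):
            g = grad_i(i, w)
            for k in range(d):
                full[k] += g[k] / n
        # Inner loop: variance reduced steps; x carries over to the next outer loop.
        for _ in range(m):
            i = rng.randrange(n)
            gx = grad_i(i, x)
            gw = grad_i(i, w)
            for k in range(d):
                x[k] -= step * (gx[k] - gw[k] + full[k])
    return x

# Hypothetical toy problem: f_i(x) = 0.5 * (a_i . x - b_i)^2 with exact labels.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
x_true = [2.0, -1.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]

def grad_i(i, x):
    # Gradient of the i-th squared residual.
    r = sum(a * xk for a, xk in zip(A[i], x)) - b[i]
    return [r * a for a in A[i]]

x_hat = svrg(grad_i, n=len(A), x0=[0.0, 0.0], step=0.1, epochs=200)
```

On this small, well-conditioned problem `x_hat` should end up close to `x_true`. The step size 0.1 is a hand-picked value for the toy data, not one derived from the paper's theory.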
Pages: 11