Variance-Reduced Methods for Machine Learning

Cited by: 49
Authors
Gower, Robert M. [1]
Schmidt, Mark [2]
Bach, Francis [3]
Richtarik, Peter [4]
Affiliations
[1] Inst Polytech Paris, LTCI Telecom Paris, F-75634 Paris, France
[2] Univ British Columbia, CCAI Affiliate Chair Amii, Vancouver, BC V6T 1Z4, Canada
[3] PSL Res Univ, INRIA, F-75006 Paris, France
[4] King Abdullah Univ Sci & Technol, Dept Comp Sci, Thuwal 23955, Saudi Arabia
Keywords
Machine learning; Optimization; Data models; Computational modeling; Logistics; Stochastic processes; Training data; Variance reduction; Quasi-Newton methods; Gradient methods; Minimization; Convergence; Descent
DOI
10.1109/JPROC.2020.3028013
Chinese Library Classification
TM (Electrical engineering); TN (Electronics and communication technology)
Discipline Classification Codes
0808; 0809
Abstract
Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight years have seen an exciting new development: variance reduction for stochastic optimization methods. These variance-reduced (VR) methods excel in settings where more than one pass through the training data is allowed, achieving faster convergence than SGD in theory and practice. These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at nonexpert readers. We focus mainly on the convex setting and provide pointers for readers interested in extensions to minimizing nonconvex functions.
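To make the variance-reduction idea in the abstract concrete, below is a minimal sketch of one classic VR method, SVRG (stochastic variance-reduced gradient), applied to a least-squares problem. This is an illustration of the general technique the review surveys, not the paper's own code; the function name, problem, and all parameter choices here are ours.

```python
import numpy as np

def svrg_least_squares(A, b, step=0.01, outer=50, inner=None, seed=0):
    """SVRG sketch for f(w) = (1/2n) * ||A w - b||^2.

    Each outer iteration computes one full gradient at a snapshot point;
    each inner step uses a single random data point but corrects it with
    the snapshot, so the gradient estimate stays unbiased while its
    variance shrinks as both iterates approach the optimum.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner = inner if inner is not None else 2 * n  # common heuristic
    w = np.zeros(d)
    for _ in range(outer):
        w_snap = w.copy()
        # Full gradient at the snapshot: one pass over all n points.
        mu = A.T @ (A @ w_snap - b) / n
        for _ in range(inner):
            i = rng.integers(n)
            g_i = A[i] * (A[i] @ w - b[i])            # gradient of f_i at w
            g_snap = A[i] * (A[i] @ w_snap - b[i])    # gradient of f_i at snapshot
            w = w - step * (g_i - g_snap + mu)        # variance-reduced step
    return w
```

Unlike plain SGD, this sketch can use a constant step size (roughly below 1 over the largest per-sample smoothness constant, max_i ||a_i||^2) and still converge to the exact minimizer, because the correction term `- g_snap + mu` drives the estimator's variance to zero at the optimum.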
Pages: 1968-1983 (16 pages)
Related Papers
50 items in total
  • [1] An accelerated stochastic variance-reduced method for machine learning problems
    Yang, Zhuang
    Chen, Zengping
    Wang, Cheng
    KNOWLEDGE-BASED SYSTEMS, 2020, 198
  • [2] Variance-Reduced Stochastic Quasi-Newton Methods for Decentralized Learning
    Zhang, Jiaojiao
    Liu, Huikang
    So, Anthony Man-Cho
    Ling, Qing
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 311 - 326
  • [3] Accelerating variance-reduced stochastic gradient methods
    Driggs, Derek
    Ehrhardt, Matthias J.
    Schönlieb, Carola-Bibiane
    MATHEMATICAL PROGRAMMING, 2022, 191 (02) : 671 - 715
  • [4] Stochastic Variance-Reduced Cubic Regularization Methods
    Zhou, Dongruo
    Xu, Pan
    Gu, Quanquan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2019, 20
  • [5] Stochastic Recursive Variance-Reduced Cubic Regularization Methods
    Zhou, Dongruo
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3980 - 3989
  • [6] Stochastic Variance-Reduced Cubic Regularized Newton Methods
    Zhou, Dongruo
    Xu, Pan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [7] Variance-reduced particle methods for solving the Boltzmann equation
    Baker, Lowell L.
    Hadjiconstantinou, Nicolas G.
    JOURNAL OF COMPUTATIONAL AND THEORETICAL NANOSCIENCE, 2008, 5 (02) : 165 - 174
  • [8] Stochastic Variance-Reduced Hamilton Monte Carlo Methods
    Zou, Difan
    Xu, Pan
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80