Variance-Reduced Methods for Machine Learning

Cited by: 49
Authors
Gower, Robert M. [1 ]
Schmidt, Mark [2 ]
Bach, Francis [3 ]
Richtarik, Peter [4 ]
Affiliations
[1] Inst Polytech Paris, LTCI Telecom Paris, F-75634 Paris, France
[2] Univ British Columbia, CCAI Affiliate Chair Amii, Vancouver, BC V6T 1Z4, Canada
[3] PSL Res Univ, INRIA, F-75006 Paris, France
[4] King Abdullah Univ Sci & Technol, Dept Comp Sci, Thuwal 23955, Saudi Arabia
Keywords
Machine learning; Optimization; Data models; Computational modeling; Logistics; Stochastic processes; Training data; optimization; variance reduction; QUASI-NEWTON METHODS; GRADIENT-METHOD; OPTIMIZATION; MINIMIZATION; CONVERGENCE; DESCENT;
DOI
10.1109/JPROC.2020.3028013
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight years have seen an exciting new development: variance reduction for stochastic optimization methods. These variance-reduced (VR) methods excel in settings where more than one pass through the training data is allowed, achieving a faster convergence than SGD in theory and practice. These speedups underline the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at nonexpert readers. We focus mainly on the convex setting and leave pointers to readers interested in extensions for minimizing nonconvex functions.
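To make the core idea concrete, here is a minimal sketch of one classic variance-reduced method, SVRG, on a synthetic least-squares problem. This is an illustration of the general VR principle described in the abstract, not code from the paper; the problem setup, step size, and epoch counts are arbitrary choices for the example.

```python
import numpy as np

# Synthetic least-squares problem: min_w (1/n) * sum_i (a_i^T w - b_i)^2
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
b = A @ w_true  # noiseless targets, so w_true is the minimizer

def grad_i(w, i):
    # Gradient of the i-th component function f_i(w) = (a_i^T w - b_i)^2.
    return 2.0 * (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    # Gradient of the full objective (one pass over all n data points).
    return 2.0 * A.T @ (A @ w - b) / n

def svrg(w0, step=0.01, epochs=30):
    w = w0.copy()
    for _ in range(epochs):
        w_ref = w.copy()       # snapshot point
        mu = full_grad(w_ref)  # full gradient at the snapshot
        for _ in range(n):
            i = rng.integers(n)
            # Variance-reduced gradient estimate: unbiased, and its
            # variance shrinks as w and w_ref approach the optimum,
            # which is what allows a constant step size and the
            # linear convergence that plain SGD lacks.
            g = grad_i(w, i) - grad_i(w_ref, i) + mu
            w -= step * g
    return w

w_hat = svrg(np.zeros(d))
print(np.linalg.norm(w_hat - w_true))  # distance to the minimizer
```

Replacing the correction terms `- grad_i(w_ref, i) + mu` with nothing recovers plain SGD, which with a constant step size stalls at a noise floor; the correction is exactly the "variance reduction" the review surveys.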
Pages: 1968-1983
Page count: 16
Related Papers
50 results total
  • [31] Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization
    Wang, Zhe
    Zhou, Yi
    Liang, Yingbin
    Lan, Guanghui
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [32] VARIANCE-REDUCED SIMULATION OF MULTISCALE TUMOR GROWTH MODELING
    Lejon, Annelies
    Mortier, Bert
    Samaey, Giovanni
    MULTISCALE MODELING & SIMULATION, 2017, 15 (01): 388-409
  • [33] Variance-Reduced Decentralized Stochastic Optimization With Accelerated Convergence
    Xin, Ran
    Khan, Usman A.
    Kar, Soummya
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2020, 68: 6255-6271
  • [34] Variance-Reduced Stochastic Gradient Descent on Streaming Data
    Jothimurugesan, Ellango
    Tahmasbi, Ashraf
    Gibbons, Phillip B.
    Tirthapura, Srikanta
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [35] Variance-Reduced and Projection-Free Stochastic Optimization
    Hazan, Elad
    Luo, Haipeng
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [36] Estimate Sequences for Variance-Reduced Stochastic Composite Optimization
    Kulunchakov, Andrei
    Mairal, Julien
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [37] Stochastic Variance-Reduced Majorization-Minimization Algorithms
    Phan, Duy Nhat
    Bartz, Sedi
    Guha, Nilabja
    Phan, Hung M.
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2024, 6 (04): 926-952
  • [38] Subsampled Stochastic Variance-Reduced Gradient Langevin Dynamics
    Zou, Difan
    Xu, Pan
    Gu, Quanquan
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2018: 508-518
  • [39] Variance-Reduced Accelerated First-Order Methods: Central Limit Theorems and Confidence Statements
    Lei, Jinlong
    Shanbhag, Uday V.
    MATHEMATICS OF OPERATIONS RESEARCH, 2024
  • [40] Pricing high-dimensional Bermudan options using variance-reduced Monte Carlo methods
    Hepperger, Peter
    JOURNAL OF COMPUTATIONAL FINANCE, 2013, 16 (03): 99-126