Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization

Cited by: 0
Authors
Fang, Cong
Lin, Zhouchen [1 ]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab Machine Percept MOE, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Asynchronous parallel algorithms have received much attention in the optimization field, driven by the demands of modern large-scale optimization problems. However, most asynchronous algorithms focus on convex problems, and analysis for nonconvex problems is lacking. For the Asynchronous Stochastic Gradient Descent (ASGD) algorithm, the best result from (Lian et al. 2015) only achieves an asymptotic O(1/ε²) rate of convergence to stationary points (i.e., ‖∇f(x)‖² ≤ ε) on nonconvex problems. In this paper, we study Stochastic Variance Reduced Gradient (SVRG) in the asynchronous setting. We propose the Asynchronous Stochastic Variance Reduced Gradient (ASVRG) algorithm for nonconvex finite-sum problems. We develop two schemes for ASVRG, depending on whether the parameters are updated atomically or not. We prove that both schemes achieve linear speedup (a non-asymptotic O(n^(2/3)/ε) rate to stationary points) for nonconvex problems when the delay parameter τ < n^(1/3), where n is the number of training samples. We also establish a non-asymptotic O(n^(2/3) τ^(1/3)/ε) rate of convergence to stationary points for our algorithm without any assumption on τ. This further demonstrates that, even with asynchronous updating, SVRG requires fewer Incremental First-order Oracle (IFO) calls than Stochastic Gradient Descent and Gradient Descent. We also conduct experiments on a shared-memory multi-core system to demonstrate the efficiency of our algorithm.
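To make the update rule concrete, below is a minimal Python sketch of an asynchronous, SVRG-style inner loop on a shared-memory multi-core system. It is an illustration under assumptions of our own (the names asvrg and grad_i, the least-squares component gradients, and the thread layout are hypothetical), not the authors' reference implementation; the two schemes described in the abstract differ in whether the shared write at the end of each inner step is performed atomically or lock-free, and the sketch shows the lock-free case.

```python
# Hypothetical sketch of an asynchronous SVRG-style loop (not the paper's code).
import threading
import numpy as np

def grad_i(x, data, i):
    """Gradient of the i-th least-squares component f_i(x) = 0.5*(a_i^T x - b_i)^2."""
    a, b = data[i]
    return (a @ x - b) * a

def asvrg(data, x0, n_epochs=10, inner_iters=None, lr=0.01, n_threads=4):
    n = len(data)
    inner_iters = inner_iters or n
    x = x0.copy()                          # shared iterate, read/written by all threads
    for _ in range(n_epochs):
        snapshot = x.copy()                # epoch snapshot (the "tilde x" of SVRG)
        full_grad = np.mean([grad_i(snapshot, data, i) for i in range(n)], axis=0)

        def worker():
            for _ in range(inner_iters // n_threads):
                i = np.random.randint(n)
                x_stale = x.copy()         # possibly delayed/inconsistent read of x
                # Variance-reduced gradient: grad f_i(x_stale) - grad f_i(snapshot) + full_grad
                v = grad_i(x_stale, data, i) - grad_i(snapshot, data, i) + full_grad
                x[:] -= lr * v             # lock-free shared write (Hogwild-style)

        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [(rng.standard_normal(5), rng.standard_normal()) for _ in range(200)]
    print(asvrg(data, np.zeros(5)))
```

An atomic-update variant would protect the shared write `x[:] -= lr * v` with a lock or per-coordinate atomic operations; the delay parameter τ in the abstract bounds how stale the read `x_stale` can be relative to the current iterate.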
Pages: 794 - 800
Number of pages: 7
Related papers
(50 records in total)
  • [1] Stochastic Variance Reduction for Nonconvex Optimization
    Reddi, Sashank J.
    Hefny, Ahmed
    Sra, Suvrit
    Poczos, Barnabas
    Smola, Alex
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [2] Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
    Lian, Xiangru
    Huang, Yijun
    Li, Yuncheng
    Liu, Ji
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [3] Stochastic Nested Variance Reduction for Nonconvex Optimization
    Zhou, Dongruo
    Xu, Pan
    Gu, Quanquan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [4] Stochastic Nested Variance Reduction for Nonconvex Optimization
    Zhou, Dongruo
    Xu, Pan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [5] Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction
    Niwa, Kenta
    Zhang, Guoqiang
    Kleijn, W. Bastiaan
    Harada, Noboru
    Sawada, Hiroshi
    Fujino, Akinori
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [6] Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
    Liu, Sijia
    Kailkhura, Bhavya
    Chen, Pin-Yu
    Ting, Paishun
    Chang, Shiyu
    Amini, Lisa
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [7] Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction
    Meng, Qi
    Chen, Wei
    Yu, Jingcheng
    Wang, Taifeng
    Ma, Zhi-Ming
    Liu, Tie-Yan
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 2329 - 2335
  • [8] Nonconvex optimization with inertial proximal stochastic variance reduction gradient
    He, Lulu
    Ye, Jimin
    Jianwei, E.
    INFORMATION SCIENCES, 2023, 648
  • [9] Asynchronous parallel algorithms for nonconvex optimization
    Cannelli, Loris
    Facchinei, Francisco
    Kungurtsev, Vyacheslav
    Scutari, Gesualdo
    MATHEMATICAL PROGRAMMING, 2020, 184 (1-2) : 121 - 154