Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization

Cited by: 0
Authors
Fang, Cong
Lin, Zhouchen [1 ]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab Machine Percept MOE, Beijing, Peoples R China
Keywords
DOI
Not available
CLC (Chinese Library Classification) Number
TP18 [Artificial Intelligence Theory]
Subject Classification Numbers
081104; 0812; 0835; 1405
Abstract
Nowadays, asynchronous parallel algorithms have received much attention in the optimization field due to the crucial demands of modern large-scale optimization problems. However, most asynchronous algorithms focus on convex problems; analysis of nonconvex problems is lacking. For the Asynchronous Stochastic Gradient Descent (ASGD) algorithm, the best result from (Lian et al. 2015) can only achieve an asymptotic O(1/ε²) rate (convergence to the stationary points, namely ‖∇f(x)‖² ≤ ε) on nonconvex problems. In this paper, we study Stochastic Variance Reduced Gradient (SVRG) in the asynchronous setting. We propose the Asynchronous Stochastic Variance Reduced Gradient (ASVRG) algorithm for nonconvex finite-sum problems, and we develop two schemes for ASVRG, depending on whether the parameters are updated atomically or not. We prove that both schemes achieve linear speedup (a non-asymptotic O(n^{2/3}/ε) rate to the stationary points) for nonconvex problems when the delay parameter τ < n^{1/3}, where n is the number of training samples. We also establish a non-asymptotic O(n^{2/3} τ^{1/3}/ε) rate (convergence to the stationary points) for our algorithm without any assumption on τ. This further demonstrates that, even with asynchronous updating, SVRG requires fewer Incremental First-order Oracle (IFO) calls than Stochastic Gradient Descent and Gradient Descent. We also conduct experiments on a shared-memory multi-core system to demonstrate the efficiency of our algorithm.
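The abstract gives no pseudocode, so the following is a minimal sketch of the kind of procedure it describes: an SVRG outer loop whose inner updates are computed from stale iterates, with the staleness bounded by the delay parameter τ. This is a serial Python simulation, not the paper's shared-memory implementation; the function name asvrg, the step size eta, and the way the delay is sampled are all illustrative assumptions.

import numpy as np

def asvrg(grad_i, x0, n, epochs=10, m=None, eta=0.01, tau=2, seed=0):
    """Serial sketch of asynchronous SVRG for min_x (1/n) * sum_i f_i(x).

    grad_i(x, i) must return the gradient of the i-th component f_i at x.
    Asynchrony is simulated: each stochastic update is computed from a
    stale iterate that lags the current one by at most tau steps.
    """
    rng = np.random.default_rng(seed)
    m = m if m is not None else n            # inner-loop length: one pass by default
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        x_snap = x.copy()                    # snapshot iterate for this epoch
        mu = np.mean([grad_i(x_snap, i) for i in range(n)], axis=0)  # full gradient
        history = [x.copy()]                 # past iterates, used to model staleness
        for _ in range(m):
            delay = int(rng.integers(0, min(tau, len(history) - 1) + 1))
            x_stale = history[-1 - delay]    # delayed read, at most tau steps old
            i = int(rng.integers(n))         # sample one component uniformly
            # variance-reduced gradient, evaluated at the stale iterate
            v = grad_i(x_stale, i) - grad_i(x_snap, i) + mu
            x = x - eta * v
            history.append(x.copy())
    return x

# Toy usage on least squares, f_i(x) = 0.5 * (a_i . x - b_i)^2:
A = np.random.default_rng(1).normal(size=(200, 5))
b = A @ np.ones(5)
x_hat = asvrg(lambda x, i: (A[i] @ x - b[i]) * A[i], np.zeros(5), n=200)

With tau = 0 this reduces to plain serial SVRG; the paper's non-atomic scheme additionally allows individual coordinates of x to come from different iterates, which this sketch does not model.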
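To make the closing IFO comparison concrete: counting one IFO call per component-gradient evaluation, the rate quoted in the abstract can be set beside the standard nonconvex rates for Gradient Descent and SGD (the GD and SGD figures below are the usual ones from the literature, not claims made by this record):

    GD:                   O(n/ε)            (n IFO calls per iteration)
    SGD:                  O(1/ε²)           (one IFO call per iteration, slower rate)
    ASVRG (τ < n^{1/3}):  O(n + n^{2/3}/ε)  (snapshot passes plus cheap inner steps)

Since n^{2/3}/ε grows more slowly than n/ε, the variance-reduced method improves on GD for large n, and it improves on SGD whenever ε is below roughly n^{-2/3}; this is the sense in which SVRG requires fewer IFO calls even under asynchronous updating.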
Pages: 794-800
Number of pages: 7