Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

Cited: 0
Authors
Lian, Xiangru [1 ]
Huang, Yijun [1 ]
Li, Yuncheng [1 ]
Liu, Ji [1 ]
Affiliations
[1] Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Asynchronous parallel implementations of stochastic gradient (SG) methods have been broadly used for training deep neural networks and have achieved many practical successes recently. However, existing theory cannot explain their convergence and speedup properties, mainly due to the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill this gap and provide theoretical support, this paper studies two asynchronous parallel implementations of SG: one over a computer network and the other on a shared-memory system. We establish an ergodic convergence rate of O(1/√K) for both algorithms and prove that a linear speedup is achievable if the number of workers is bounded by √K, where K is the total number of iterations. Our results generalize and improve the existing analysis for convex minimization.
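The shared-memory variant the abstract refers to is in the spirit of lock-free (Hogwild!-style) SG. Below is a minimal Python sketch, not the authors' implementation: worker threads read the shared iterate without locks, compute a stochastic gradient of a smooth nonconvex loss, and write the update in place, so reads may be stale or inconsistent. The synthetic sigmoid-loss objective, step size, batch size, and thread count are illustrative assumptions.

import threading
import numpy as np

# Synthetic nonconvex problem (assumption): squared error of a sigmoid model.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
y = rng.integers(0, 2, size=1000).astype(float)
x = np.zeros(20)  # shared parameter vector, updated lock-free by all workers

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stochastic_grad(w, gen, batch=16):
    # Mini-batch gradient of f(w) = 0.5 * mean((sigmoid(Aw) - y)^2),
    # a smooth but nonconvex objective.
    idx = gen.integers(0, A.shape[0], size=batch)
    Ai, yi = A[idx], y[idx]
    p = sigmoid(Ai @ w)
    return Ai.T @ ((p - yi) * p * (1.0 - p)) / batch

def worker(seed, num_steps=2000, step_size=0.05):
    gen = np.random.default_rng(seed)  # per-thread RNG (generators are not thread-safe)
    for _ in range(num_steps):
        snapshot = x.copy()            # lock-free read; may be stale or inconsistent
        g = stochastic_grad(snapshot, gen)
        x[:] -= step_size * g          # lock-free in-place write; races are tolerated

# CPython threads interleave under the GIL, so this shows the update pattern
# (stale reads, unsynchronized writes) rather than true hardware parallelism.
threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("final loss:", 0.5 * np.mean((sigmoid(A @ x) - y) ** 2))

Per the abstract, running K total updates this way keeps the ergodic average of the expected squared gradient norm at O(1/√K), and with up to √K workers the speedup over serial SG is linear.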
Pages: 9
Related papers
50 records in total
  • [1] Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization
    Fang, Cong
    Lin, Zhouchen
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 794 - 800
  • [2] Asynchronous parallel algorithms for nonconvex optimization
    Cannelli, Loris
    Facchinei, Francisco
    Kungurtsev, Vyacheslav
    Scutari, Gesualdo
    [J]. MATHEMATICAL PROGRAMMING, 2020, 184 (1-2) : 121 - 154
  • [3] Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
    Akyildiz, Omer Deniz
    Crisan, Dan
    Miguez, Joaquin
    [J]. STATISTICS AND COMPUTING, 2020, 30 (06) : 1645 - 1663
  • [4] Decentralized Asynchronous Nonconvex Stochastic Optimization on Directed Graphs
    Kungurtsev, Vyacheslav
    Morafah, Mahdi
    Javidi, Tara
    Scutari, Gesualdo
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (04): 1796 - 1804
  • [5] Asynchronous Parallel Nonconvex Large-Scale Optimization
    Cannelli, L.
    Facchinei, F.
    Kungurtsev, V.
    Scutari, G.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 4706 - 4710
  • [6] Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization
    Horvath, Samuel
    Lei, Lihua
    Richtarik, Peter
    Jordan, Michael I.
    [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2022, 4 (02): 634 - 648
  • [7] Asynchronous Decentralized Parallel Stochastic Gradient Descent
    Lian, Xiangru
    Zhang, Wei
    Zhang, Ce
    Liu, Ji
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018
  • [8] Stochastic generalized gradient method for nonconvex nonsmooth stochastic optimization
    Ermol'ev, Yu. M.
    Norkin, V. I.
    [J]. Cybernetics and Systems Analysis, 1998, 34 : 196 - 215