Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

Cited: 0
Authors
Kinoshita, Yuri [1 ]
Suzuki, Taiji [1 ,2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Math Informat, Tokyo, Japan
[2] RIKEN, Ctr Adv Intelligence Project, Tokyo, Japan
Keywords
GLOBAL OPTIMIZATION; INEQUALITY; POINCARE
DOI
not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms for solving the sampling problems and non-convex optimization problems that arise in many machine learning applications. In particular, its variance-reduced versions have recently attracted considerable attention. In this paper, we study two variants of this kind, namely, the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. We prove their convergence to the objective distribution in terms of KL-divergence under the sole assumptions of smoothness and a Log-Sobolev inequality, which are weaker conditions than those used in prior analyses of these algorithms. With the batch size and the inner-loop length both set to √n, the gradient complexity to achieve an ε-precision is Õ((n + dn^(1/2) ε^(-1)) γ L₂² α^(-2)), which improves on all previous analyses. We also present some essential applications of our result to non-convex optimization.
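To make the variance-reduced Langevin updates discussed in the abstract concrete, here is a minimal Python sketch of a Stochastic Variance Reduced Gradient Langevin Dynamics loop for a finite-sum potential f(x) = (1/n) Σᵢ fᵢ(x), with the batch size and inner-loop length both set to √n as in the stated rate. The function names and interface (`svrg_ld`, `grad_fi`) are illustrative assumptions, not the paper's own implementation.

```python
import numpy as np

def svrg_ld(grad_fi, x0, n, step=1e-3, n_outer=20, rng=None):
    """Sketch of SVRG Langevin Dynamics for f(x) = (1/n) * sum_i f_i(x).

    grad_fi(x, i) returns the gradient of the i-th component f_i at x.
    Batch size and inner-loop length are both set to sqrt(n).
    """
    rng = np.random.default_rng() if rng is None else rng
    b = m = max(1, int(round(np.sqrt(n))))  # batch size = inner loop = sqrt(n)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for _ in range(n_outer):
        x_snap = x.copy()
        # full gradient at the snapshot point (computed once per outer loop)
        g_snap = sum(grad_fi(x_snap, i) for i in range(n)) / n
        for _ in range(m):
            idx = rng.integers(0, n, size=b)
            # variance-reduced gradient estimate: correct the snapshot
            # gradient with a minibatch of gradient differences
            g = g_snap + sum(grad_fi(x, i) - grad_fi(x_snap, i)
                             for i in idx) / b
            # Langevin step: gradient descent plus injected Gaussian noise
            x = x - step * g + np.sqrt(2.0 * step) * rng.standard_normal(d)
    return x
```

For a quadratic potential fᵢ(x) = ½‖x − cᵢ‖², the target distribution is Gaussian centered at the mean of the cᵢ, so iterates should concentrate near that mean.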
Pages: 13