Improved Variance Reduction Methods for Riemannian Non-Convex Optimization

Cited by: 4
Authors
Han, Andi [1 ]
Gao, Junbin [1 ]
Affiliations
[1] Univ Sydney, Business Sch, Discipline Business Analyt, Sydney, NSW 2006, Australia
Funding
Australian Research Council
Keywords
Complexity theory; Optimization; Manifolds; Convergence; Convex functions; Training; Principal component analysis; Riemannian optimization; non-convex optimization; online optimization; variance reduction; batch size adaptation;
DOI
10.1109/TPAMI.2021.3112139
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Variance reduction is a popular technique for accelerating gradient descent and stochastic gradient descent on optimization problems defined over both Euclidean spaces and Riemannian manifolds. This paper further improves on existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, by providing a unified framework for batch size adaptation. The framework is more general than existing work in that it accommodates retractions, vector transports, and mini-batch stochastic gradients. We show that the adaptive-batch variance reduction methods require lower gradient complexities for both general non-convex and gradient-dominated functions, under both finite-sum and online optimization settings. Moreover, under the new framework we complete the convergence analysis of R-SVRG and R-SRG, which is currently missing from the literature. We prove convergence of R-SVRG with a much simpler analysis, which leads to curvature-free complexity bounds. We also show improved results for R-SRG under double-loop convergence, which match the optimal complexities of R-SPIDER. In addition, we prove the first online complexity results for R-SVRG and R-SRG. Lastly, we discuss the potential of adapting the batch size for non-smooth, constrained, and second-order Riemannian optimizers. Extensive experiments on a variety of applications support the analysis and claims of the paper.
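
To make the R-SVRG estimator mentioned in the abstract concrete, the following is a minimal numerical sketch (not the authors' implementation) of Riemannian SVRG on the unit sphere, applied to leading-eigenvector (PCA) estimation, one of the applications named in the keywords. The retraction, the projection-based vector transport, the rsvrg routine, and all parameter values are illustrative assumptions, and the paper's batch size adaptation is omitted. The variance-reduced direction is v_t = grad f_{S_t}(x_t) - T_{x~ -> x_t}(grad f_{S_t}(x~) - grad f(x~)), followed by the retraction step x_{t+1} = R_{x_t}(-eta * v_t), where x~ is the snapshot point of the current outer loop.

import numpy as np

def proj(x, g):
    # Project a Euclidean gradient g onto the tangent space of the sphere at x.
    return g - (x @ g) * x

def retract(x, v):
    # Metric-projection retraction on the sphere: R_x(v) = (x + v) / ||x + v||.
    y = x + v
    return y / np.linalg.norm(y)

def transport(y, v):
    # Vector transport by orthogonal projection onto the tangent space at y.
    return v - (y @ v) * y

def rgrad(Z, x, idx=None):
    # Riemannian gradient of f_S(x) = -(1/|S|) * sum_{i in S} (z_i^T x)^2.
    Zs = Z if idx is None else Z[idx]
    egrad = -2.0 * Zs.T @ (Zs @ x) / len(Zs)
    return proj(x, egrad)

def rsvrg(Z, x0, eta=0.02, epochs=50, inner=100, batch=8, seed=0):
    rng = np.random.default_rng(seed)
    n, x = len(Z), x0
    for _ in range(epochs):
        x_ref, full = x, rgrad(Z, x)          # snapshot point and full gradient
        for _ in range(inner):
            idx = rng.integers(n, size=batch)
            # Variance-reduced direction: correct the mini-batch gradient by the
            # transported difference with the snapshot gradient.
            v = rgrad(Z, x, idx) - transport(x, rgrad(Z, x_ref, idx) - full)
            x = retract(x, -eta * v)
    return x

# Usage: recover the top eigenvector of the sample covariance of Z.
rng = np.random.default_rng(1)
Z = rng.standard_normal((1000, 20)) @ np.diag(np.linspace(2.0, 0.1, 20))
v0 = rng.standard_normal(20)
x = rsvrg(Z, x0=v0 / np.linalg.norm(v0))
top = np.linalg.eigh(Z.T @ Z / len(Z))[1][:, -1]
print("alignment with top eigenvector:", abs(x @ top))

Projection is used here both as the retraction and as the vector transport because both are cheap and well defined on the sphere; on other manifolds the paper's framework allows any retraction/transport pair satisfying its assumptions.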
Pages: 7610-7623
Page count: 14