Improved Variance Reduction Methods for Riemannian Non-Convex Optimization

Cited by: 4
Authors
Han, Andi [1]
Gao, Junbin [1]
Affiliations
[1] Univ Sydney, Business Sch, Discipline Business Analyt, Sydney, NSW 2006, Australia
Funding
Australian Research Council;
Keywords
Complexity theory; Optimization; Manifolds; Convergence; Convex functions; Training; Principal component analysis; Riemannian optimization; non-convex optimization; online optimization; variance reduction; batch size adaptation;
DOI
10.1109/TPAMI.2021.3112139
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Variance reduction is a popular technique for accelerating gradient descent and stochastic gradient descent on optimization problems defined over both Euclidean spaces and Riemannian manifolds. This paper further improves on existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, by providing a unified framework for batch size adaptation. This framework is more general than existing work because it considers retraction, vector transport, and mini-batch stochastic gradients. We show that the adaptive-batch variance reduction methods require lower gradient complexities for both general non-convex and gradient-dominated functions, under both finite-sum and online optimization settings. Moreover, under the new framework, we complete the analysis of R-SVRG and R-SRG, which is currently missing in the literature. We prove convergence of R-SVRG with a much simpler analysis, which leads to curvature-free complexity bounds. We also show improved results for R-SRG under double-loop convergence, which match the optimal complexities of R-SPIDER. In addition, we prove the first online complexity results for R-SVRG and R-SRG. Lastly, we discuss the potential of adapting batch size for non-smooth, constrained, and second-order Riemannian optimizers. Extensive experiments on a variety of applications support the analysis and claims in the paper.
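As a rough illustration of the ingredients named in the abstract (retraction, vector transport, mini-batch stochastic gradients), here is a minimal sketch of an SVRG-style update on the unit sphere, applied to a leading-eigenvector (PCA-like) problem. It is not the paper's algorithm: the batch size is fixed rather than adapted, and the function names (`r_svrg`, `retract`, `proj`, `rgrad`), the step size, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def proj(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    c = dot(x, v)
    return [vi - c * xi for vi, xi in zip(v, x)]

def retract(x, v):
    """Retraction on the sphere: step along v, then renormalize."""
    y = [xi + vi for xi, vi in zip(x, v)]
    nrm = math.sqrt(dot(y, y))
    return [yi / nrm for yi in y]

def rgrad(x, batch):
    """Riemannian gradient of f(x) = -(1 / (2|B|)) * sum_i (z_i . x)^2 over batch B."""
    g = [0.0] * len(x)
    for z in batch:
        c = dot(z, x)
        g = [gi - c * zi / len(batch) for gi, zi in zip(g, z)]
    return proj(x, g)

def cost(x, data):
    return -sum(dot(z, x) ** 2 for z in data) / (2 * len(data))

def r_svrg(data, x0, step=0.05, epochs=10, inner=20, batch_size=2, seed=0):
    """SVRG-style loop on the sphere with a fixed (non-adaptive) batch size."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(epochs):
        snap = list(x)              # snapshot point for this epoch
        full = rgrad(snap, data)    # full gradient at the snapshot
        for _ in range(inner):
            batch = rng.sample(data, batch_size)
            # The correction term lives in the tangent space at `snap`;
            # projection onto the tangent space at x acts as the vector transport.
            corr = [a - b for a, b in zip(rgrad(snap, batch), full)]
            v = [a - b for a, b in zip(rgrad(x, batch), proj(x, corr))]
            x = retract(x, [-step * vi for vi in v])
    return x

# Synthetic data (assumed): points concentrated along the first coordinate axis.
data_rng = random.Random(1)
data = [[2.0 + data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1)]
        for _ in range(20)]
x0 = [1 / math.sqrt(3)] * 3
x_star = r_svrg(data, x0)
```

Projection is used here for both the retraction and the vector transport because it is cheap on the sphere; other manifolds would need their own choices for both maps.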
Pages: 7610-7623
Page count: 14
Related papers
50 in total
  • [11] Stochastic variable metric proximal gradient with variance reduction for non-convex composite optimization
    Fort, Gersende
    Moulines, Eric
    STATISTICS AND COMPUTING, 2023, 33 (03)
  • [12] Momentum-Based Variance Reduction in Non-Convex SGD
    Cutkosky, Ashok
    Orabona, Francesco
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [13] Improved Analysis of Clipping Algorithms for Non-convex Optimization
    Zhang, Bohang
    Jin, Jikai
    Fang, Cong
    Wang, Liwei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [14] Constrained Non-convex Optimization via Stochastic Variance Reduced Approximations
    Nutalapati, Mohan Krishna
    Krishna, Muppavaram Sai
    Samanta, Atanu
    Rajawat, Ketan
    2019 SIXTH INDIAN CONTROL CONFERENCE (ICC), 2019, : 293 - 298
  • [15] Private Stochastic Non-convex Optimization with Improved Utility Rates
    Zhang, Qiuchen
    Ma, Jing
    Lou, Jian
    Xiong, Li
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3370 - 3376
  • [16] Improved Time of Arrival measurement model for non-convex optimization
    Sidorenko, Juri
    Schatz, Volker
    Doktorski, Leo
    Scherer-Negenborn, Norbert
    Arens, Michael
    Hugentobler, Urs
    NAVIGATION-JOURNAL OF THE INSTITUTE OF NAVIGATION, 2019, 66 (01): : 117 - 128
  • [17] Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization
    Kavis, Ali
    Skoulakis, Stratis
    Antonakopoulos, Kimon
    Dadi, Leello Tadesse
    Cevher, Volkan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [18] Non-Convex Optimization: A Review
    Trehan, Dhruv
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020), 2020, : 418 - 423
  • [19] Non-convex scenario optimization
    Garatti, Simone
    Campi, Marco C.
    MATHEMATICAL PROGRAMMING, 2024, 209 (1) : 557 - 608
  • [20] Non-Convex Distributed Optimization
    Tatarenko, Tatiana
    Touri, Behrouz
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (08) : 3744 - 3757