Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

Cited: 0
Authors
Liu, Jin [1 ]
Pan, Xiaokang [1 ]
Duan, Junwen [1 ]
Li, Hong-Dong [1 ]
Li, Youqi [2 ]
Qu, Zhe [1 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
(not available)
CLC number
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper delves into stochastic optimization for compositional minimax optimization, a pivotal challenge across various machine learning domains, including deep AUC maximization and reinforcement learning policy evaluation. Despite its significance, compositional minimax optimization remains under-explored. Adding to the difficulty, current methods for this problem suffer from sub-optimal complexities or rely heavily on sizable batch sizes. To address these constraints, this paper introduces a novel method, called Nested STOchastic Recursive Momentum (NSTORM), which achieves the optimal sample complexity and obtains a near-accurate solution, matching existing minimax methods. We also demonstrate that NSTORM achieves the same sample complexity under the Polyak-Lojasiewicz (PL) condition, an insightful extension of its capabilities. However, NSTORM requires low learning rates, which may constrain its applicability in real-world machine learning. To overcome this hurdle, we present ADAptive NSTORM (ADA-NSTORM), which employs adaptive learning rates. We prove that ADA-NSTORM achieves the same sample complexity, while experimental results show that it is more effective in practice. All derived complexities indicate that our methods match the lower bounds of existing minimax optimization methods without requiring a large batch size at each iteration. Extensive experiments support the efficiency of the proposed methods.
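The variance-reduction mechanism underlying NSTORM is a STORM-style recursive momentum estimator, d_t = g(w_t; ξ_t) + (1 − β)(d_{t−1} − g(w_{t−1}; ξ_t)), where the same sample ξ_t is evaluated at both the new and the old iterate. The sketch below is only an illustration of that update on a toy, non-compositional minimax problem with hypothetical noisy gradient oracles; it is not the paper's NSTORM algorithm, and all names and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def grads(x, y, n1, n2):
    # Noisy gradients of F(x, y) = x*y + 0.5*x**2 - 0.5*y**2 (saddle at (0, 0)).
    # Both evaluations within one iteration reuse the same noise sample (n1, n2),
    # which is what allows the recursive correction term to cancel variance.
    return y + x + n1, x - y + n2

x, y = 2.0, -2.0
lr, beta = 0.05, 0.1                        # step size and momentum parameter
n1, n2 = rng.normal(scale=0.1, size=2)
dx, dy = grads(x, y, n1, n2)                # initialize estimators with one sample
for _ in range(2000):
    x_new, y_new = x - lr * dx, y + lr * dy  # descent on x, ascent on y
    n1, n2 = rng.normal(scale=0.1, size=2)
    gx_new, gy_new = grads(x_new, y_new, n1, n2)
    gx_old, gy_old = grads(x, y, n1, n2)     # same sample at the previous point
    # STORM-style recursive momentum: SGD estimate plus a variance-reducing correction
    dx = gx_new + (1 - beta) * (dx - gx_old)
    dy = gy_new + (1 - beta) * (dy - gy_old)
    x, y = x_new, y_new

print(f"x ~ {x:.3f}, y ~ {y:.3f}")           # iterates settle near the saddle (0, 0)
```

Because the correction term (1 − β)(d_{t−1} − g(w_{t−1}; ξ_t)) reuses the current sample, the estimator error contracts geometrically instead of accumulating fresh noise at each step, which is why such methods avoid the large per-iteration batches mentioned in the abstract.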
Pages: 13927-13935 (9 pages)