Painless Stochastic Conjugate Gradient for Large-Scale Machine Learning

Times Cited: 1
Author
Yang, Zhuang [1 ,2 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Sun Yat-sen Univ, Sch Elect & Commun Engn, Guangzhou 510275, Peoples R China
Funding
China Postdoctoral Science Foundation;
Keywords
Adaptive step size; machine learning; mini-batches; stochastic conjugate gradient (SCG); variance reduction; MINI-BATCH ALGORITHMS; SIZE SELECTION; SEARCH;
DOI
10.1109/TNNLS.2023.3280826
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Conjugate gradient (CG), an effective technique for accelerating gradient descent, has shown great potential and is widely used for large-scale machine-learning problems. However, CG and its variants were not devised for the stochastic setting, which makes them extremely unstable and can even lead to divergence when noisy gradients are used. This article develops a novel class of stable stochastic CG (SCG) algorithms with a faster convergence rate, obtained via a variance-reduction technique and an adaptive step-size rule in the mini-batch setting. Specifically, in place of a line search, which is time-consuming in CG-type approaches and can even fail for SCG, this article uses the random stabilized Barzilai-Borwein (RSBB) method to obtain the step size online. We rigorously analyze the convergence properties of the proposed algorithms and show that they attain a linear convergence rate in both the strongly convex and nonconvex settings. We also show that the total complexity of the proposed algorithms matches that of modern stochastic optimization algorithms under different cases. Extensive numerical experiments on machine-learning problems demonstrate that the proposed algorithms outperform state-of-the-art stochastic optimization algorithms.
Pages: 1-14
Page Count: 14
Related Papers
50 records total
  • [1] Large-scale machine learning with fast and stable stochastic conjugate gradient
    Yang, Zhuang
    [J]. COMPUTERS & INDUSTRIAL ENGINEERING, 2022, 173
  • [2] Adaptive Powerball Stochastic Conjugate Gradient for Large-Scale Learning
    Yang, Zhuang
    [J]. IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (06) : 1598 - 1606
  • [3] Large-Scale Machine Learning with Stochastic Gradient Descent
    Bottou, Leon
    [J]. COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, : 177 - 186
  • [4] An online conjugate gradient algorithm for large-scale data analysis in machine learning
    Xue, Wei
    Wan, Pengcheng
    Li, Qiao
    Zhong, Ping
    Yu, Gaohang
    Tao, Tao
    [J]. AIMS MATHEMATICS, 2021, 6 (02): : 1515 - 1537
  • [5] Adaptive stochastic conjugate gradient for machine learning
    Yang, Zhuang
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 206
  • [7] MEAN-NORMALIZED STOCHASTIC GRADIENT FOR LARGE-SCALE DEEP LEARNING
    Wiesler, Simon
    Richard, Alexander
    Schlueter, Ralf
    Ney, Hermann
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] Stochastic Conjugate Gradient Descent Twin Support Vector Machine for Large Scale Pattern Classification
    Sharma, Sweta
    Rastogi, Reshma
    [J]. AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 590 - 602
  • [9] Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning
    Yang, Zhuang
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [10] Accelerated Variance Reduction Stochastic ADMM for Large-Scale Machine Learning
    Liu, Yuanyuan
    Shang, Fanhua
    Liu, Hongying
    Kong, Lin
    Jiao, Licheng
    Lin, Zhouchen
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (12) : 4242 - 4255