Explicit stabilised gradient descent for faster strongly convex optimisation

Cited: 7
Authors
Eftekhari, Armin [1 ]
Vandereycken, Bart [2 ]
Vilmart, Gilles [2 ]
Zygalakis, Konstantinos C. [3 ]
Affiliations
[1] Umea Univ, Dept Math & Math Stat, S-90187 Umea, Sweden
[2] Univ Geneva, Sect Math, CP 64, CH-1211 Geneva 4, Switzerland
[3] Univ Edinburgh, Sch Math, Edinburgh EH9 3FD, Midlothian, Scotland
Funding
Swiss National Science Foundation; UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Runge-Kutta methods; Strongly convex optimization; Accelerated gradient descent; Chebyshev methods; Stiff
DOI
10.1007/s10543-020-00819-y
Chinese Library Classification
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
This paper introduces the Runge-Kutta Chebyshev descent method (RKCD) for strongly convex optimisation problems. The new algorithm is based on explicit stabilised integrators for stiff differential equations, a powerful class of numerical schemes that avoid the severe step size restriction faced by standard explicit integrators. For quadratic and strongly convex objectives, the paper proves that RKCD nearly achieves the optimal convergence rate of the conjugate gradient algorithm, and that the suboptimality of RKCD diminishes as the condition number of the quadratic function worsens. It is further established that this optimal rate is also obtained for a partitioned variant of RKCD applied to perturbations of quadratic functions. In addition, numerical experiments on general strongly convex problems show that RKCD outperforms Nesterov's accelerated gradient descent.
Pages: 119-139
Number of pages: 21
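
Code sketch (not from the paper): the abstract above does not spell out the RKCD update, so the following Python sketch only illustrates the kind of first-order Chebyshev-stabilised Runge-Kutta recurrence that such methods build on, applied to the gradient flow y' = -grad f(y) of an ill-conditioned quadratic. The helper names (chebyshev_params, stabilised_gradient_step) and the choices of stage number s, damping eta and step size h are illustrative assumptions, not the parameter choices analysed in the paper.

import numpy as np

def cheb_T_dT(s, x):
    # Chebyshev polynomial of the first kind T_s(x) and its derivative T_s'(x),
    # via the recurrences T_{j+1} = 2x T_j - T_{j-1} and
    # T_{j+1}' = 2 T_j + 2x T_j' - T_{j-1}'.
    if s == 0:
        return 1.0, 0.0
    T_prev, T_cur = 1.0, x        # T_0, T_1
    dT_prev, dT_cur = 0.0, 1.0    # T_0', T_1'
    for _ in range(2, s + 1):
        T_next = 2.0 * x * T_cur - T_prev
        dT_next = 2.0 * T_cur + 2.0 * x * dT_cur - dT_prev
        T_prev, T_cur = T_cur, T_next
        dT_prev, dT_cur = dT_cur, dT_next
    return T_cur, dT_cur

def chebyshev_params(s, eta=0.05):
    # Standard parameters of a damped s-stage first-order Chebyshev method
    # (eta = 0.05 is a conventional damping value, assumed here).
    w0 = 1.0 + eta / s**2
    Ts, dTs = cheb_T_dT(s, w0)
    return w0, Ts / dTs

def stabilised_gradient_step(grad, y, h, s, eta=0.05):
    # One s-stage Chebyshev-stabilised explicit step for the gradient flow
    # y' = -grad(y); the internal stages follow the usual three-term
    # recurrence, so one outer step costs s gradient evaluations.
    w0, w1 = chebyshev_params(s, eta)
    K_prev2 = y
    K_prev1 = y - h * (w1 / w0) * grad(y)      # first internal stage
    T_prev, T_cur = 1.0, w0                    # T_0(w0), T_1(w0)
    for _ in range(2, s + 1):
        T_next = 2.0 * w0 * T_cur - T_prev
        mu = 2.0 * w1 * T_cur / T_next
        nu = 2.0 * w0 * T_cur / T_next
        kappa = -T_prev / T_next
        K = nu * K_prev1 + kappa * K_prev2 - h * mu * grad(K_prev1)
        K_prev2, K_prev1 = K_prev1, K
        T_prev, T_cur = T_cur, T_next
    return K_prev1

# Illustration on an ill-conditioned quadratic f(x) = 0.5 x^T A x.
rng = np.random.default_rng(0)
eigvals = np.logspace(0, 2, 50)                # condition number 100
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = Q @ np.diag(eigvals) @ Q.T
grad = lambda v: A @ v
L = eigvals.max()

s = 10
w0, w1 = chebyshev_params(s)
h = (1.0 + w0) / (w1 * L)                      # largest step keeping [0, L] inside the
                                               # real stability interval; h * L is about
                                               # 2 s^2, far beyond the h * L <= 2 limit
                                               # of plain gradient descent
x = rng.standard_normal(50)
print("initial f:", 0.5 * x @ A @ x)
for _ in range(100):                           # 100 outer steps = 1000 gradient calls
    x = stabilised_gradient_step(grad, x, h, s)
print("final f:  ", 0.5 * x @ A @ x)

The point of the sketch is the enlarged stability region: explicit Euler on the gradient flow (plain gradient descent) is stable only for h * L <= 2, whereas the s-stage Chebyshev recurrence tolerates steps with h * L of order s^2. The paper's RKCD additionally ties the number of stages and the damping to the condition number so as to approach the conjugate gradient rate; those choices are not reproduced here.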