Strong error analysis for stochastic gradient descent optimization algorithms

Cited: 13
Authors
Jentzen, Arnulf [1 ]
Kuckuck, Benno [1 ]
Neufeld, Ariel [2 ]
von Wurstemberger, Philippe [3 ]
Affiliations
[1] Univ Munster, Fac Math & Comp Sci, D-48149 Munster, Germany
[2] NTU Singapore, Div Math Sci, Singapore 637371, Singapore
[3] Swiss Fed Inst Technol, Dept Math, CH-8092 Zurich, Switzerland
Funding
Swiss National Science Foundation
Keywords
Stochastic gradient descent; Stochastic approximation algorithms; Strong error analysis; CONVERGENCE RATE; ROBBINS-MONRO; APPROXIMATION; MOMENTS; RATES;
DOI
10.1093/imanum/drz055
Chinese Library Classification
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove, for every arbitrarily small ε ∈ (0, ∞) and every arbitrarily large p ∈ (0, ∞), that the considered SGD optimization algorithm converges in the strong L^p-sense with order 1/2 − ε to the global minimum of the objective function of the considered stochastic optimization problem, under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm. The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions; then, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures; and, thereafter, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to achieve, for every arbitrarily large p ∈ (0, ∞), strong L^p-convergence rates.
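The strong L^p error rate of order 1/2 − ε described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's construction; it runs plain SGD with Robbins-Monro step sizes γ/n on the strongly convex quadratic f(θ) = θ²/2 (global minimum 0) with additive Gaussian gradient noise, and estimates the strong L^p error E[|θ_n|^p]^(1/p) by Monte Carlo. The function names `sgd_quadratic` and `lp_error` and the parameter choices (γ = 1, noise level 0.1) are illustrative assumptions, not from the article.

```python
import random

def sgd_quadratic(theta0, steps, gamma=1.0, noise=0.1, seed=0):
    # Toy SGD sketch (illustrative, not the paper's setting):
    # minimize f(theta) = theta**2 / 2 using noisy gradients
    # and Robbins-Monro step sizes gamma / n.
    rng = random.Random(seed)
    theta = theta0
    for n in range(1, steps + 1):
        grad = theta + noise * rng.gauss(0.0, 1.0)  # exact gradient plus noise
        theta -= (gamma / n) * grad
    return theta

def lp_error(steps, p=2.0, trials=2000):
    # Monte Carlo estimate of the strong L^p error E[|theta_steps - 0|^p]^(1/p),
    # averaging over independent noise realizations (one seed per trial).
    errs = [abs(sgd_quadratic(1.0, steps, seed=s)) ** p for s in range(trials)]
    return (sum(errs) / trials) ** (1.0 / p)
```

For this quadratic, the iterates satisfy θ_n ~ N(0, noise²/n), so `lp_error` shrinks at roughly the rate n^{−1/2}, matching the order 1/2 − ε claimed in the abstract for the general convex setting.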
Pages: 455-492
Page count: 38
Related Papers
50 in total
  • [31] Wireless Network Optimization via Stochastic Sub-gradient Descent: Rate Analysis
    Bedi, Amrit Singh
    Rajawat, Ketan
    [J]. 2018 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2018,
  • [32] ON DISTRIBUTED STOCHASTIC GRADIENT ALGORITHMS FOR GLOBAL OPTIMIZATION
    Swenson, Brian
    Sridhar, Anirudh
    Poor, H. Vincent
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8594 - 8598
  • [33] A Comparative Analysis of Gradient Descent-Based Optimization Algorithms on Convolutional Neural Networks
    Dogo, E. M.
    Afolabi, O. J.
    Nwulu, N. I.
    Twala, B.
    Aigbavboa, C. O.
    [J]. PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON COMPUTATIONAL TECHNIQUES, ELECTRONICS AND MECHANICAL SYSTEMS (CTEMS), 2018, : 92 - 99
  • [34] Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
    Vakili, Sattar
    Salgia, Sudeep
    Zhao, Qing
    [J]. 2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 432 - 438
  • [35] STOCHASTIC GRADIENT DESCENT ALGORITHM FOR STOCHASTIC OPTIMIZATION IN SOLVING ANALYTIC CONTINUATION PROBLEMS
    Bao, Feng
    Maier, Thomas
    [J]. FOUNDATIONS OF DATA SCIENCE, 2020, 2 (01): : 1 - 17
  • [36] The Minimization of Empirical Risk Through Stochastic Gradient Descent with Momentum Algorithms
    Chaudhuri, Arindam
    [J]. ARTIFICIAL INTELLIGENCE METHODS IN INTELLIGENT ALGORITHMS, 2019, 985 : 168 - 181
  • [37] Robust Pose Graph Optimization Using Stochastic Gradient Descent
    Wang, John
    Olson, Edwin
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 4284 - 4289
  • [38] An Efficient Preconditioner for Stochastic Gradient Descent Optimization of Image Registration
    Qiao, Yuchuan
    Lelieveldt, Boudewijn P. F.
    Staring, Marius
    [J]. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2019, 38 (10) : 2314 - 2325
  • [39] Stochastic gradient descent for hybrid quantum-classical optimization
    Sweke, Ryan
    Wilde, Frederik
    Meyer, Johannes Jakob
    Schuld, Maria
    Faehrmann, Paul K.
    Meynard-Piganeau, Barthelemy
    Eisert, Jens
    [J]. QUANTUM, 2020, 4
  • [40] The combination of particle swarm optimization and stochastic gradient descent with momentum
    Chen, Chi-Hua
    [J]. ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2022, 18 : 132 - 132