Data-Dependent Stability of Stochastic Gradient Descent

Cited by: 0
Authors
Kuzborskij, Ilja [1]
Lampert, Christoph H. [2]
Affiliations
[1] Univ Milan, Milan, Italy
[2] IST Austria, Klosterneuburg, Austria
Funding
European Research Council
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of stochastic gradient.
Pages: 10
Related papers
50 in total
  • [31] Distributed Byzantine Tolerant Stochastic Gradient Descent in the Era of Big Data
    Jin, Richeng
    He, Xiaofan
    Dai, Huaiyu
    ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019
  • [32] RECENT TRENDS IN STOCHASTIC GRADIENT DESCENT FOR MACHINE LEARNING AND BIG DATA
    Newton, David
    Pasupathy, Raghu
    Yousefian, Farzad
    2018 WINTER SIMULATION CONFERENCE (WSC), 2018, : 366 - 380
  • [35] Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent
    Zhu, Lingjiong
    Gurbuzbalaban, Mert
    Raj, Anant
    Simsekli, Umut
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [36] Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers
    Paquin, Alexandre Lemire
    Chaib-draa, Brahim
    Giguere, Philippe
    NEURAL NETWORKS, 2023, 164 : 382 - 394
  • [37] Algorithmic Stability of Heavy-Tailed Stochastic Gradient Descent on Least Squares
    Raj, Anant
    Barsbey, Melih
    Gurbuzbalaban, Mert
    Zhu, Lingjiong
    Simsekli, Umut
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 1292 - 1342
  • [38] Online Projected Gradient Descent for Stochastic Optimization With Decision-Dependent Distributions
    Wood, Killian
    Bianchin, Gianluca
    Dall'Anese, Emiliano
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 1646 - 1651
  • [39] Convergence of Stochastic Gradient Descent for PCA
    Shamir, Ohad
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [40] Stochastic Gradient Descent in Continuous Time
    Sirignano, Justin
    Spiliopoulos, Konstantinos
    SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2017, 8 (01): 933 - 961