Data-Dependent Stability of Stochastic Gradient Descent

被引:0
|
作者
Kuzborskij, Ilja [1 ]
Lampert, Christoph H. [2 ]
机构
[1] Univ Milan, Milan, Italy
[2] IST Austria, Klosterneuburg, Austria
基金
欧洲研究理事会;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of stochastic gradient.
引用
下载
收藏
页数:10
相关论文
共 50 条
  • [41] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [42] On the Hyperparameters in Stochastic Gradient Descent with Momentum
    Shi, Bin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [43] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [44] Graph Drawing by Stochastic Gradient Descent
    Zheng, Jonathan X.
    Pawar, Samraat
    Goodman, Dan F. M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2019, 25 (09) : 2738 - 2748
  • [45] Randomized Stochastic Gradient Descent Ascent
    Sebbouh, Othmane
    Cuturi, Marco
    Peyre, Gabriel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [46] BACKPROPAGATION AND STOCHASTIC GRADIENT DESCENT METHOD
    AMARI, S
    NEUROCOMPUTING, 1993, 5 (4-5) : 185 - 196
  • [47] Nonparametric Budgeted Stochastic Gradient Descent
    Trung Le
    Vu Nguyen
    Tu Dinh Nguyen
    Dinh Phung
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 564 - 572
  • [48] On the discrepancy principle for stochastic gradient descent
    Jahn, Tim
    Jin, Bangti
    INVERSE PROBLEMS, 2020, 36 (09)
  • [49] Benign Underfitting of Stochastic Gradient Descent
    Koren, Tomer
    Livni, Roi
    Mansour, Yishay
    Sherman, Uri
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [50] The effective noise of stochastic gradient descent
    Mignacco, Francesca
    Urbani, Pierfrancesco
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2022, 2022 (08):