Data-Dependent Stability of Stochastic Gradient Descent

Cited by: 0
Authors
Kuzborskij, Ilja [1 ]
Lampert, Christoph H. [2 ]
Affiliations
[1] Univ Milan, Milan, Italy
[2] IST Austria, Klosterneuburg, Austria
Funding
European Research Council
Keywords
DOI
Not available
CLC Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification
081104; 0812; 0835; 1405
Abstract
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has a crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise in the stochastic gradients.
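The pre-screening strategy suggested in the abstract can be illustrated with a minimal sketch: draw several candidate initialization points, evaluate the empirical risk at each, and start SGD from the candidate with the lowest risk. Everything below (the synthetic least-squares problem, the step size, the number of candidates) is an illustrative assumption, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem (assumed for illustration):
# empirical risk R(w) = mean_i (x_i . w - y_i)^2
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def empirical_risk(w):
    return float(np.mean((X @ w - y) ** 2))

def sgd(w0, steps=500, lr=0.01):
    """Plain SGD: one uniformly sampled example per step."""
    w = w0.copy()
    for _ in range(steps):
        i = rng.integers(n)                    # sample one training example
        grad = 2.0 * (X[i] @ w - y[i]) * X[i]  # stochastic gradient of the loss
        w -= lr * grad
    return w

# Data-driven pre-screening: among several random initializations,
# start from the one with the lowest empirical risk (the convex-case
# bound above depends on the risk at the initialization point).
candidates = [rng.normal(size=d) for _ in range(10)]
w0 = min(candidates, key=empirical_risk)
w_final = sgd(w0)
```

In the convex setting described above, picking the lowest-risk initialization tightens the data-dependent bound; the same screening heuristic is what the abstract refers to as stabilizing SGD by pre-screening.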
Pages: 10
Related Papers
50 total
  • [21] Preconditioned Stochastic Gradient Descent
    Li, Xi-Lin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (05) : 1454 - 1466
  • [22] Stochastic gradient descent tricks
    Bottou, Léon
    LECTURE NOTES IN COMPUTER SCIENCE, 2012, 7700 : 421 - 436
  • [23] Stochastic Reweighted Gradient Descent
    El Hanchi, Ayoub
    Stephens, David A.
    Maddison, Chris J.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [24] Byzantine Stochastic Gradient Descent
    Alistarh, Dan
    Allen-Zhu, Zeyuan
    Li, Jerry
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [25] Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
    Lei, Yunwen
    Ying, Yiming
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [26] Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
    Wang, Jiahuan
    Chen, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15511 - 15519
  • [27] Fully Empirical and Data-Dependent Stability-Based Bounds
    Oneto, Luca
    Ghio, Alessandro
    Ridella, Sandro
    Anguita, Davide
    IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (09) : 1913 - 1926
  • [28] A Continuous-time Stochastic Gradient Descent Method for Continuous Data
    Jin, Kexin
    Latz, Jonas
    Liu, Chenguang
    Schönlieb, Carola-Bibiane
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [29] Evaluation of Stochastic Gradient Descent Methods for Nonlinear Mapping of Hyperspectral Data
    Myasnikov, Evgeny
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2016), 2016, 9730 : 276 - 283
  • [30] Asynchronous Peer-to-Peer Data Mining with Stochastic Gradient Descent
    Ormandi, Robert
    Hegedus, Istvan
    Jelasity, Mark
    EURO-PAR 2011 PARALLEL PROCESSING, PT 1, 2011, 6852 : 528 - 540