Data-Dependent Stability of Stochastic Gradient Descent

Cited by: 0
Authors
Kuzborskij, Ilja [1 ]
Lampert, Christoph H. [2 ]
Affiliations
[1] Univ Milan, Milan, Italy
[2] IST Austria, Klosterneuburg, Austria
Funding
European Research Council
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD) and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has a crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results yield optimistic generalization bounds that exhibit fast convergence rates for SGD, subject to a vanishing empirical risk and low noise of the stochastic gradient.
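The pre-screening strategy mentioned in the abstract can be illustrated with a short sketch. The Python snippet below is a hypothetical illustration, not the authors' implementation: it evaluates the empirical risk of several candidate initialization points on a toy least-squares problem and starts SGD from the lowest-risk candidate, in the spirit of the data-driven stabilization the abstract suggests. All function names and parameters are illustrative.

import numpy as np

def empirical_risk(w, X, y):
    # Mean squared error of a linear model; stands in for the empirical risk.
    return 0.5 * np.mean((X @ w - y) ** 2)

def sgd(w0, X, y, lr=0.01, epochs=5, seed=0):
    # Plain SGD on the squared loss, starting from the screened point w0.
    rng = np.random.default_rng(seed)
    w = w0.copy()
    n = len(y)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i . w - y_i)^2
            w -= lr * grad
    return w

def prescreen_init(candidates, X, y):
    # Data-driven screening: pick the candidate with the lowest empirical risk.
    return min(candidates, key=lambda w: empirical_risk(w, X, y))

# Toy usage: draw random candidate initializations, screen, then train.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
candidates = [rng.normal(size=5) for _ in range(10)]
w0 = prescreen_init(candidates, X, y)
w = sgd(w0, X, y)
print("risk at screened init:", empirical_risk(w0, X, y))
print("risk after SGD:", empirical_risk(w, X, y))

In the convex case described in the abstract, the generalization bound depends on the risk at the initialization point, which is exactly the quantity this screening step minimizes over the candidate set.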
Pages: 10
Related Papers (showing 10 of 50)
  • [1] Bijral, Avleen; Sarwate, Anand D.; Srebro, Nathan. Data-Dependent Bounds on Network Gradient Descent. 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2016: 869-874.
  • [2] Toulis, Panos; Tran, Dustin; Airoldi, Edoardo M. Towards Stability and Optimality in Stochastic Gradient Descent. Artificial Intelligence and Statistics, 2016, 51: 1290-1298.
  • [3] Sun, Tao; Li, Dongsheng; Wang, Bao. Stability and Generalization of Decentralized Stochastic Gradient Descent. Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence, 2021, 35: 9756-9764.
  • [4] Patel, Vivak; Zhang, Shushu; Tian, Bowen. Global Convergence and Stability of Stochastic Gradient Descent. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [5] Bijral, Avleen S.; Sarwate, Anand D.; Srebro, Nathan. Data-Dependent Convergence for Consensus Stochastic Optimization. IEEE Transactions on Automatic Control, 2017, 62(9): 4483-4498.
  • [6] Rasonyi, Miklos; Tikosi, Kinga. On the Stability of the Stochastic Gradient Langevin Algorithm with Dependent Data Stream. Statistics & Probability Letters, 2022, 182.
  • [7] Bassily, Raef; Feldman, Vitaly; Guzman, Cristobal; Talwar, Kunal. Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33.
  • [8] Yang, Liu; Cai, Deng. AdaDB: An Adaptive Gradient Method with Data-Dependent Bound. Neurocomputing, 2021, 419: 183-189.
  • [9] Wang, Yihan; Liu, Shuang; Gao, Xiao-Shan. Data-Dependent Stability Analysis of Adversarial Training. Neural Networks, 2025, 183.
  • [10] Ma, Anna; Needell, Deanna. Stochastic Gradient Descent for Linear Systems with Missing Data. Numerical Mathematics: Theory, Methods and Applications, 2019, 12(1): 1-20.