Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers

Cited by: 0
Authors
Park, Sejun [1 ]
Simsekli, Umut [2 ]
Erdogdu, Murat A. [3 ,4 ]
Affiliations
[1] Korea Univ, Seoul, South Korea
[2] Univ PSL, DI ENS, Ecole Normale Super, CNRS, INRIA, Paris, France
[3] Univ Toronto, Toronto, ON, Canada
[4] Vector Inst, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC); European Research Council (ERC); National Research Foundation of Singapore
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
In this paper, we propose a new covering technique localized to the trajectories of SGD. This localization provides an algorithm-specific complexity, measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that incur an exponential dependence on dimension. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with P pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by O(√(log n · log(nP) / n)), where n is the number of data samples. In particular, this rate is independent of dimension and requires neither early stopping nor decaying step sizes. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and K-means clustering under both hard- and soft-label setups, improving the known state-of-the-art rates.
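To make the scaling of this rate concrete, the following is a minimal Python sketch (not from the paper) that numerically evaluates the O(√(log n · log(nP) / n)) rate stated in the abstract; the function name generalization_bound and the leading constant c are illustrative assumptions, since the abstract does not specify the hidden problem-dependent constants.

```python
import math

def generalization_bound(n: int, P: int, c: float = 1.0) -> float:
    """Evaluate c * sqrt(log(n) * log(n * P) / n), the rate from the abstract.

    n: number of data samples; P: number of strongly convex and smooth pieces.
    c stands in for the unspecified problem-dependent constant (an assumption).
    """
    return c * math.sqrt(math.log(n) * math.log(n * P) / n)

if __name__ == "__main__":
    # The bound decays roughly as 1/sqrt(n) (up to log factors) and grows
    # only logarithmically in P; the ambient dimension never appears.
    for n in (10**3, 10**4, 10**5):
        for P in (1, 100, 10**6):
            print(f"n={n:>7}  P={P:>9}  bound={generalization_bound(n, P):.4f}")
```

Running this illustrates the dimension independence claimed above: raising P from 1 to 10^6 inflates the bound only by a small constant factor (through the √log(nP) term), while multiplying n by 100 shrinks it by roughly a factor of ten.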
Pages: 13
Related Papers (showing 10 of 50)
  • [1] Generalization Bounds for Label Noise Stochastic Gradient Descent
    Huh, Jung Eun
    Rebeschini, Patrick
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024
  • [2] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019
  • [3] Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
    Wang, Jiahuan
    Chen, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024: 15511-15519
  • [4] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25: 1-56
  • [5] Stability and Generalization of Decentralized Stochastic Gradient Descent
    Sun, Tao
    Li, Dongsheng
    Wang, Bao
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 9756-9764
  • [6] Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
    Haghifam, Mahdi
    Rodriguez-Galvez, Borja
    Thobaben, Ragnar
    Skoglund, Mikael
    Roy, Daniel M.
    Dziugaite, Gintare Karolina
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023: 663-706
  • [7] Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm
    Zhu, Miaoxi
    Shen, Li
    Du, Bo
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [8] Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
    Lei, Yunwen
    Ying, Yiming
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [9] The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study
    Park, Daniel S.
    Sohl-Dickstein, Jascha
    Le, Quoc V.
    Smith, Samuel L.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [10] Stochastic Approximate Gradient Descent via the Langevin Algorithm
    Qiu, Yixuan
    Wang, Xiao
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 5428-5435