Empirical risk minimization in the interpolating regime with application to neural network learning

Cited: 0
Authors
Muecke, Nicole [1 ]
Steinwart, Ingo [2 ]
Affiliations
[1] Tech Univ Carolo Wilhelmina Braunschweig, Inst Math Stochast, Braunschweig, Germany
[2] Univ Stuttgart, Inst Stochast & Applicat, Stuttgart, Germany
Keywords
Neural network learning; Overparameterization; Interpolation; Empirical risk minimization; Deep; Convergence
DOI
10.1007/s10994-025-06738-9
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
A common strategy for training deep neural networks (DNNs) is to use very large architectures and to train them until they (almost) achieve zero training error. The empirically observed good generalization performance on test data, even in the presence of substantial label noise, corroborates such a procedure. On the other hand, statistical learning theory tells us that overfitted models may generalize poorly, as occurs, e.g., in empirical risk minimization (ERM) over overly large hypothesis classes. Inspired by this seemingly contradictory behavior, so-called interpolation methods have recently received much attention, yielding learning methods that are consistent and even optimal, e.g., some local averaging schemes with zero training error. We extend this analysis to ERM-like methods for least squares regression and show that for certain large hypothesis classes, called inflated histograms, some interpolating empirical risk minimizers enjoy very good statistical guarantees while others fail in the worst sense. Moreover, we show that the same phenomenon occurs for DNNs with zero training error and sufficiently large architectures.
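To make the interpolation idea in the abstract concrete, below is a minimal Python/NumPy sketch of a histogram-style least squares estimator that is "inflated" to attain zero training error. This is an illustration of the general phenomenon only, not the paper's inflated-histogram construction; the function names, the uniform partition of [0, 1], and the eps radius of the interpolation spikes are assumptions made for the example.

import numpy as np

# Illustrative sketch, NOT the paper's construction: a uniform histogram
# regressor on [0, 1], "inflated" with tiny spikes at the training points
# so that it interpolates the data (zero training error).

def fit_histogram(x, y, n_bins=10):
    """Per-bin means of y over a uniform partition of [0, 1]."""
    bins = np.clip((x * n_bins).astype(int), 0, n_bins - 1)
    means = np.zeros(n_bins)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            means[b] = y[mask].mean()
    return means

def predict_inflated(x_new, x_train, y_train, means, n_bins=10, eps=1e-3):
    """Histogram prediction, overridden by the exact observed label within
    an eps-neighborhood of each training point (the 'inflation')."""
    bins = np.clip((x_new * n_bins).astype(int), 0, n_bins - 1)
    preds = means[bins]
    for i, xn in enumerate(x_new):
        j = np.abs(x_train - xn).argmin()
        if abs(x_train[j] - xn) < eps:
            preds[i] = y_train[j]  # spike: reproduce the training label
    return preds

rng = np.random.default_rng(0)
x = rng.uniform(size=200)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=200)
means = fit_histogram(x, y)
train_preds = predict_inflated(x, x, y, means)
print(np.max(np.abs(train_preds - y)))  # 0.0: the estimator interpolates

Because the spikes cover only an eps-neighborhood of each training point, the inflated estimator agrees with the plain histogram estimate outside a set of vanishing measure as eps shrinks, which is the intuition for why such an interpolating empirical risk minimizer can still generalize well while other interpolants over the same class need not.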
Pages: 52