Empirical risk minimization in the interpolating regime with application to neural network learning

Cited: 0
Authors
Muecke, Nicole [1 ]
Steinwart, Ingo [2 ]
Affiliations
[1] Tech Univ Carolo Wilhelmina Braunschweig, Inst Math Stochast, Braunschweig, Germany
[2] Univ Stuttgart, Inst Stochast & Applicat, Stuttgart, Germany
Keywords
Neural network learning; Overparameterization; Interpolation; Empirical risk minimization; Deep; Convergence
DOI
10.1007/s10994-025-06738-9
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
A common strategy for training deep neural networks (DNNs) is to use very large architectures and to train them until they (almost) achieve zero training error. The empirically observed good generalization performance on test data, even in the presence of substantial label noise, corroborates such a procedure. On the other hand, statistical learning theory tells us that overfitted models may generalize poorly, as occurs, e.g., in empirical risk minimization (ERM) over overly large hypothesis classes. Inspired by this seemingly contradictory behavior, so-called interpolation methods have recently received much attention, yielding learning methods that are consistent and even optimal, e.g., some local averaging schemes with zero training error. We extend this analysis to ERM-like methods for least squares regression and show that for certain large hypothesis classes, called inflated histograms, some interpolating empirical risk minimizers enjoy very good statistical guarantees while others fail in the worst sense. Moreover, we show that the same phenomenon occurs for DNNs with zero training error and sufficiently large architectures.
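To make the interpolation idea in the abstract concrete, below is a minimal Python/NumPy sketch of a histogram-style least squares estimator that is "inflated" to attain zero training error. This is an illustration of the general phenomenon only, not the paper's inflated-histogram construction; the function names, the uniform partition of [0, 1], and the eps radius of the interpolation spikes are assumptions made for the example.

import numpy as np

# Illustrative sketch, NOT the paper's construction: a uniform histogram
# regressor on [0, 1], "inflated" with tiny spikes at the training points
# so that it interpolates the data (zero training error).

def fit_histogram(x, y, n_bins=10):
    """Per-bin means of y over a uniform partition of [0, 1]."""
    bins = np.clip((x * n_bins).astype(int), 0, n_bins - 1)
    means = np.zeros(n_bins)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            means[b] = y[mask].mean()
    return means

def predict_inflated(x_new, x_train, y_train, means, n_bins=10, eps=1e-3):
    """Histogram prediction, overridden by the exact observed label within
    an eps-neighborhood of each training point (the 'inflation')."""
    bins = np.clip((x_new * n_bins).astype(int), 0, n_bins - 1)
    preds = means[bins]
    for i, xn in enumerate(x_new):
        j = np.abs(x_train - xn).argmin()
        if abs(x_train[j] - xn) < eps:
            preds[i] = y_train[j]  # spike: reproduce the training label
    return preds

rng = np.random.default_rng(0)
x = rng.uniform(size=200)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=200)
means = fit_histogram(x, y)
train_preds = predict_inflated(x, x, y, means)
print(np.max(np.abs(train_preds - y)))  # 0.0: the estimator interpolates

Because the spikes cover only an eps-neighborhood of each training point, the inflated estimator agrees with the plain histogram estimate outside a set of vanishing measure as eps shrinks, which is the intuition for why such an interpolating empirical risk minimizer can still generalize well while other interpolants over the same class need not.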
Pages: 52