Empirical risk minimization in the interpolating regime with application to neural network learning

Cited by: 0
Authors
Muecke, Nicole [1 ]
Steinwart, Ingo [2 ]
Affiliations
[1] Tech Univ Carolo Wilhelmina Braunschweig, Inst Math Stochast, Braunschweig, Germany
[2] Univ Stuttgart, Inst Stochast & Applicat, Stuttgart, Germany
Keywords
Neural network learning; Overparameterization; Interpolation; Empirical risk minimization; DEEP; CONVERGENCE
DOI
10.1007/s10994-025-06738-9
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A common strategy for training deep neural networks (DNNs) is to use very large architectures and to train them until they (almost) achieve zero training error. The empirically observed good generalization performance on test data, even in the presence of substantial label noise, corroborates such a procedure. On the other hand, it is known in statistical learning theory that over-fitted models may generalize poorly; this occurs, e.g., for empirical risk minimization (ERM) over too large hypothesis classes. Inspired by this contradictory behavior, so-called interpolation methods have recently received much attention, yielding consistent and even optimal learning methods, e.g., for some local averaging schemes with zero training error. We extend this analysis to ERM-like methods for least squares regression and show that, for certain large hypothesis classes called inflated histograms, some interpolating empirical risk minimizers enjoy very good statistical guarantees while others fail in the worst possible sense. Moreover, we show that the same phenomenon occurs for DNNs with zero training error and sufficiently large architectures.
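To make the mechanism described above concrete, the following is a minimal Python sketch (not the authors' construction) of a histogram-style interpolating least squares estimator: it predicts the per-bin label average everywhere except within a tiny radius of each training input, where it returns the observed label exactly. The training error is thus zero, yet the fit deviates from the plain histogram only on an arbitrarily small set. The function name, the bin count, and the spike_radius parameter are illustrative assumptions.

```python
import numpy as np

def fit_interpolating_histogram(X, y, n_bins=10, spike_radius=1e-6):
    """Toy 1-D 'inflated histogram': predict the bin average everywhere,
    except in a tiny ball around each training point, where the observed
    label is returned, so the training error is exactly zero."""
    edges = np.linspace(X.min(), X.max(), n_bins + 1)
    bin_of = lambda x: np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    idx = bin_of(X)
    # Plain (non-interpolating) histogram regressor: per-bin label means.
    means = np.array([y[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])

    def predict(x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        pred = means[bin_of(x_new)]
        # "Spikes": inside a tiny ball around a training input, return
        # that input's label, which forces exact interpolation.
        for xi, yi in zip(X, y):
            pred = np.where(np.abs(x_new - xi) < spike_radius, yi, pred)
        return pred

    return predict

# Usage: zero training error, yet away from the spikes the fit
# coincides with the ordinary histogram estimator.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 50)
y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=50)
f = fit_interpolating_histogram(X, y)
assert np.allclose(f(X), y)  # interpolation: zero training error
```

Shrinking spike_radius leaves the training error at zero while making the interpolating fit agree with the ordinary histogram estimator on all but a vanishing fraction of the input space; this coexistence of benign and harmful interpolants over the same class is the kind of dichotomy the abstract describes.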
Pages: 52
Related papers
50 records in total
  • [21] Genetic algorithm optimize neural network based on Structural Risk Minimization
    Fan, JS
    Tao, Q
    Fang, TJ
PROCEEDINGS OF THE 3RD WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-5, 2000 : 948 - 952
  • [22] Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization
    Owusu-Agyemang, Kwabena
    Qin, Zhen
    Benjamin, Appiah
    Xiong, Hu
    Qin, Zhiguang
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (04) : 4772 - 4796
  • [23] Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization
    Jayaraman, Bargav
    Wang, Lingxiao
    Evans, David
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [24] Minimization of Empirical Risk as a Means of Choosing the Number of Hypotheses in Algebraic Machine Learning
    Vinogradov, D. V.
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (03) : 525 - 528
  • [26] On Graph Reconstruction via Empirical Risk Minimization: Fast Learning Rates and Scalability
    Papa, Guillaume
    Clemencon, Stephan
    Bellet, Aurelien
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [27] A recurrent fuzzy neural network: Learning and application
    Ballini, R
    Gomide, F
VII BRAZILIAN SYMPOSIUM ON NEURAL NETWORKS, PROCEEDINGS, 2002 : 153 - 153
  • [28] Local complexities for empirical risk minimization
    Bartlett, PL
    Mendelson, S
    Philips, P
    LEARNING THEORY, PROCEEDINGS, 2004, 3120 : 270 - 284
  • [29] Aggregation via empirical risk minimization
    Guillaume Lecué
    Shahar Mendelson
    Probability Theory and Related Fields, 2009, 145 : 591 - 613
  • [30] EMPIRICAL RISK MINIMIZATION IN INVERSE PROBLEMS
    Klemela, Jussi
    Mammen, Enno
ANNALS OF STATISTICS, 2010, 38 (01) : 482 - 511