On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Cited by: 0
Authors
Li, Yicheng [1]
Zhang, Haobo [1]
Lin, Qian [1]
Affiliations
[1] Tsinghua Univ, Ctr Stat Sci, Dept Ind Engn, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The widely observed 'benign overfitting phenomenon' in the neural network literature challenges the 'bias-variance trade-off' doctrine of statistical learning theory. Since the generalization ability of a 'lazy trained' over-parametrized neural network can be well approximated by that of neural tangent kernel regression, the curve of the excess risk (namely, the learning curve) of kernel ridge regression has attracted increasing attention recently. However, most recent arguments about the learning curve are heuristic and rely on the 'Gaussian design' assumption. In this paper, under mild and more realistic assumptions, we rigorously provide a full asymptotic characterization of the learning curve under a power-law decay condition on the eigenvalues of the kernel and on the target function. The learning curve elucidates the effects of, and the interplay among, the choice of the regularization parameter, the source condition, and the noise. In particular, our results suggest that the 'benign overfitting phenomenon' exists in over-parametrized neural networks only when the noise level is small.
Pages: 24