Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks

Citations: 0
Authors
Nguyen, Quynh [1 ]
Mondelli, Marco [2 ]
Montufar, Guido [1 ,3 ]
Affiliations
[1] MPI MIS, Leipzig, Germany
[2] IST Austria, Klosterneuburg, Austria
[3] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been related to the memorization capacity, the global convergence of gradient descent algorithms, and the generalization of deep nets. However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. In the finite-width setting, the network architectures we consider are fairly general: we require the existence of a wide layer with on the order of N neurons, where N is the number of data samples, and the scaling of the remaining layer widths is arbitrary (up to logarithmic factors). To obtain our results, we analyze various quantities of independent interest: we give lower bounds on the smallest singular value of hidden feature matrices, and upper bounds on the Lipschitz constant of input-output feature maps.
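
As context for the quantity studied in the abstract, the following is a minimal sketch (not taken from the paper) of how the smallest eigenvalue of an infinite-width NTK Gram matrix can be computed for a fully connected ReLU network, using the standard arc-cosine-kernel recursion (Jacot et al., 2018). The He-style scaling, the absence of biases, the depth, and the unit-norm inputs are illustrative assumptions and need not match the paper's exact setup.

    import numpy as np

    def relu_ntk_gram(X, depth=3):
        # Infinite-width NTK Gram matrix of a fully connected ReLU net,
        # via the standard arc-cosine-kernel recursion with He-style
        # weight scaling (c_sigma = 2) and no biases.
        Sigma = X @ X.T              # Sigma^(0): input Gram matrix
        Theta = Sigma.copy()         # Theta^(0): NTK accumulator
        for _ in range(depth):
            d = np.sqrt(np.diag(Sigma))
            corr = np.clip(Sigma / np.outer(d, d), -1.0, 1.0)
            ang = np.arccos(corr)
            # Dual kernels of ReLU and its derivative under a Gaussian
            Sigma_next = np.outer(d, d) * (np.sin(ang) + (np.pi - ang) * np.cos(ang)) / np.pi
            Sigma_dot = (np.pi - ang) / np.pi
            Theta = Theta * Sigma_dot + Sigma_next
            Sigma = Sigma_next
        return Theta

    # Example: smallest NTK eigenvalue for random unit-norm inputs
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 50))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    K = relu_ntk_gram(X, depth=3)
    print("lambda_min(NTK) =", np.linalg.eigvalsh(K).min())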
Pages: 11
Related Papers
50 items in total (items [41]-[50] shown)
  • [41] GRAPH CONVOLUTIONAL NETWORKS FROM THE PERSPECTIVE OF SHEAVES AND THE NEURAL TANGENT KERNEL
    Gebhart, Thomas
    TOPOLOGICAL, ALGEBRAIC AND GEOMETRIC LEARNING WORKSHOPS 2022, VOL 196, 2022, 196
  • [42] Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel
    Richards, Dominic
    Kuzborskij, Ilja
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021 : 8609 - 8621
  • [44] A Kernel Perspective for Regularizing Deep Neural Networks
    Bietti, Alberto
    Mialon, Gregoire
    Chen, Dexiong
    Mairal, Julien
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [45] Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective
    Chu, Lei
    Pan, Hao
    Wang, Wenping
    ACM TRANSACTIONS ON GRAPHICS, 2021, 40 (03)
  • [46] Deep ReLU neural networks overcome the curse of dimensionality for partial integrodifferential equations
    Gonon, Lukas
    Schwab, Christoph
    ANALYSIS AND APPLICATIONS, 2023, 21 (01) : 1 - 47
  • [47] Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks
    Liao, Fangshuo
    Kyrillidis, Anastasios
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 237, 2024, 237
  • [48] Optimal approximation of piecewise smooth functions using deep ReLU neural networks
    Petersen, Philipp
    Voigtlaender, Felix
    NEURAL NETWORKS, 2018, 108 : 296 - 330
  • [49] Deep convolutional neural networks for eigenvalue problems in mechanics
    Finol, David
    Lu, Yan
    Mahadevan, Vijay
    Srivastava, Ankit
    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, 2019, 118 (05) : 258 - 275
  • [50] Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks
    Cao, Yuan
    Gu, Quanquan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 3349 - 3356