Best k-Layer Neural Network Approximations

Cited by: 1
Authors
Lim, Lek-Heng [1 ]
Michalek, Mateusz [2 ,3 ]
Qi, Yang [4 ]
Affiliations
[1] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Konstanz, D-78457 Constance, Germany
[4] Ecole Polytech, INRIA Saclay Ile France, CMAP, IP Paris, CNRS, F-91128 Palaiseau, France
Keywords
Neural network; Best approximation; Join loci; Secant loci
DOI
10.1007/s00365-021-09545-2
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set $s_1, \ldots, s_n \in \mathbb{R}^p$ with corresponding responses $t_1, \ldots, t_n \in \mathbb{R}^q$, fitting a $k$-layer neural network $v_\theta : \mathbb{R}^p \to \mathbb{R}^q$ involves estimation of the weights $\theta \in \mathbb{R}^m$ via an ERM: $\inf_{\theta \in \mathbb{R}^m} \sum_{i=1}^n \lVert t_i - v_\theta(s_i) \rVert_2^2$. We show that even for $k = 2$, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. In addition, we deduce that if one attempts to minimize such a loss function when its infimum is not attainable, the values of $\theta$ necessarily diverge to $\pm\infty$. We will show that for the smooth activations $\sigma(x) = 1/(1 + \exp(-x))$ and $\sigma(x) = \tanh(x)$, such failure to attain an infimum can happen on a positive-measure subset of responses. For the ReLU activation $\sigma(x) = \max(0, x)$, we completely classify the cases where the ERM for a best two-layer neural network approximation attains its infimum. In recent applications of neural networks, where overfitting is commonplace, the failure to attain an infimum is avoided by ensuring that the system of equations $t_i = v_\theta(s_i)$, $i = 1, \ldots, n$, has a solution. For a two-layer ReLU-activated network, we show when such a system of equations has a solution generically, i.e., when such a neural network can be fitted perfectly with probability one.
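As an illustration of the ERM objective described in the abstract, the following minimal Python sketch (not taken from the paper; the dimensions p and q, the hidden width r, the sample count n, and all variable names are assumptions chosen for illustration) evaluates the two-layer ReLU empirical risk $\sum_{i=1}^n \lVert t_i - v_\theta(s_i) \rVert_2^2$ at a random parameter value.

import numpy as np

rng = np.random.default_rng(0)
p, q, r, n = 3, 2, 4, 20                 # input dim, output dim, hidden width, sample count (illustrative)

S = rng.normal(size=(n, p))              # training inputs  s_1, ..., s_n in R^p
T = rng.normal(size=(n, q))              # training targets t_1, ..., t_n in R^q

def relu(x):
    return np.maximum(0.0, x)

def empirical_risk(W1, b1, W2, b2):
    # Two-layer network v_theta(s) = W2 relu(W1 s + b1) + b2,
    # with theta = (W1, b1, W2, b2); returns sum_i || t_i - v_theta(s_i) ||_2^2.
    H = relu(S @ W1.T + b1)              # hidden activations, shape (n, r)
    V = H @ W2.T + b2                    # network outputs,    shape (n, q)
    return np.sum((T - V) ** 2)

# Evaluate the loss at a random theta; as the abstract notes, a numerical
# minimizer applied to this objective can drive ||theta|| toward infinity
# whenever the infimum is not attained.
W1, b1 = rng.normal(size=(r, p)), rng.normal(size=r)
W2, b2 = rng.normal(size=(q, r)), rng.normal(size=q)
print(empirical_risk(W1, b1, W2, b2))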
Pages: 583-604
Page count: 22