Sharp asymptotics on the compression of two-layer neural networks

Cited by: 0
Authors
Amani, Mohammad Hossein [1 ]
Bombari, Simone [2 ]
Mondelli, Marco [2 ]
Pukdee, Rattana [3 ]
Rini, Stefano [4 ]
Affiliations
[1] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[2] IST Austria, Klosterneuburg, Austria
[3] CMU, Pittsburgh, PA USA
[4] NYCU, Hsinchu, Taiwan
DOI
10.1109/ITW54588.2022.9965870
Chinese Library Classification: TP [Automation Technology, Computer Technology]
Discipline code: 0812
Abstract
In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M < N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L-2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. By using tools from high-dimensional probability, we show that this non-convex problem can be simplified when the target network is sufficiently over-parameterized, and provide the error rate of this approximation as a function of the input dimension and N. In this mean-field limit, the simplified objective, as well as the optimal weights of the compressed network, does not depend on the realization of the target network, but only on expected scaling factors. Furthermore, for networks with ReLU activation, we conjecture that the optimum of the simplified optimization problem is achieved by taking weights on the Equiangular Tight Frame (ETF), while the scaling of the weights and the orientation of the ETF depend on the parameters of the target network. Numerical evidence is provided to support this conjecture.
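The abstract's conjecture states that, for ReLU networks in the mean-field limit, the optimal compressed weights lie on a (scaled, rotated) Equiangular Tight Frame. As a hypothetical illustration of the object involved (not code from the paper), the following NumPy sketch builds the simplex ETF: M unit vectors in R^(M-1) whose pairwise inner products are all equal to -1/(M-1).

```python
import numpy as np

def simplex_etf(M):
    """Build the simplex equiangular tight frame (ETF): M unit vectors in
    R^(M-1) with all pairwise inner products equal to -1/(M-1)."""
    E = np.eye(M)                                   # standard basis of R^M
    C = E - 1.0 / M                                 # subtract the centroid of the basis vectors
    C /= np.linalg.norm(C, axis=1, keepdims=True)   # normalize each row
    # The rows of C span an (M-1)-dimensional subspace; express them in an
    # orthonormal basis of that subspace to get coordinates in R^(M-1).
    Q, _ = np.linalg.qr(C[:-1].T)                   # Q: orthonormal basis of the row space
    return C @ Q                                    # shape (M, M-1)

W = simplex_etf(4)
G = W @ W.T   # Gram matrix: ones on the diagonal, -1/3 off the diagonal
```

In the paper's setting, the compressed network has M nodes and the conjectured optimum is such a frame after applying a scaling and an orthogonal rotation determined by the target network's parameters; the construction above only fixes the equiangular geometry itself.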
Pages: 588-593 (6 pages)
Related papers (showing 41-50 of 50)
  • [41] Use of two-layer neural networks to answer scientific questions in radiation oncology
    Schmelz, Helmut
    Eich, Hans Theodor
    Haverkamp, Uwe
    Rehn, Stephan
    Hering, Dominik
    STRAHLENTHERAPIE UND ONKOLOGIE, 2023, 199 : S66 - S66
  • [42] Cumulant-based training algorithms of two-layer feedforward neural networks
    Dai, XH
    SIGNAL PROCESSING, 2000, 80 (08) : 1597 - 1606
  • [43] On the learning dynamics of two-layer quadratic neural networks for understanding deep learning
    Tan, Zhenghao
    Chen, Songcan
    FRONTIERS OF COMPUTER SCIENCE, 2022, 16 (03)
  • [44] Improved learning algorithm for two-layer neural networks for identification of nonlinear systems
    Vargas, Jose A. R.
    Pedrycz, Witold
    Hemerly, Elder M.
    NEUROCOMPUTING, 2019, 329 : 86 - 96
  • [45] Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
    Liu, Fanghui
    Dadi, Leello
    Cevher, Volkan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 42
  • [46] A Riemannian mean field formulation for two-layer neural networks with batch normalization
    Ma, Chao
    Ying, Lexing
    RESEARCH IN THE MATHEMATICAL SCIENCES, 2022, 9 (03)
  • [48] A heuristic two-layer reinforcement learning algorithm based on BP neural networks
    Liu, Zhibin
    Zeng, Xiaoqin
    Liu, Huiyi
    Chu, Rong
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2015, 52 (03) : 579 - 587
  • [49] Dynamics of the two-layer pseudoinverse neural network
    Li, Shujun
    Huang, Wuqun
    Chen, Tianlun
    Science Bulletin, 1995, (20) : 1691 - 1694
  • [50] SOLUTION OF THE PROBLEM OF COMPRESSION OF A TWO-LAYER NONLINEAR MATERIAL
    Senashov, S. I.
    Savost'yanova, I. L.
    JOURNAL OF APPLIED MECHANICS AND TECHNICAL PHYSICS, 2023, 64 (04) : 712 - 714