Sharp asymptotics on the compression of two-layer neural networks

Cited by: 0
Authors
Amani, Mohammad Hossein [1]
Bombari, Simone [2]
Mondelli, Marco [2]
Pukdee, Rattana [3]
Rini, Stefano [4]
Affiliations
[1] École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
[2] Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria
[3] Carnegie Mellon University, Pittsburgh, PA, USA
[4] National Yang Ming Chiao Tung University (NYCU), Hsinchu, Taiwan
DOI: 10.1109/ITW54588.2022.9965870
CLC number: TP [Automation Technology, Computer Technology]
Subject classification code: 0812
Abstract
In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M < N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. Using tools from high-dimensional probability, we show that this non-convex problem can be simplified when the target network is sufficiently over-parameterized, and we provide the error rate of this approximation as a function of the input dimension and N. In this mean-field limit, the simplified objective, as well as the optimal weights of the compressed network, does not depend on the realization of the target network, but only on expected scaling factors. Furthermore, for networks with ReLU activation, we conjecture that the optimum of the simplified optimization problem is achieved by taking weights on an Equiangular Tight Frame (ETF), while the scaling of the weights and the orientation of the ETF depend on the parameters of the target network. Numerical evidence is provided to support this conjecture.
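The objective described in the abstract is easy to probe numerically. The sketch below is illustrative only: the network form f(x) = (1/num_nodes) * Σ_i ReLU(w_i · x), the choice of Gaussian weights as the sub-Gaussian example, the simplex_etf helper, and all dimensions are assumptions made here and are not taken from the paper. It Monte Carlo estimates the population L2 loss between a target network with N nodes and a compressed network with M < N nodes under standard Gaussian inputs, and constructs an equiangular-tight-frame weight configuration of the kind the conjecture refers to.

```python
# Minimal sketch (not the authors' code): Monte Carlo estimate of the population
# L2 loss between a target two-layer ReLU network with N nodes and a compressed
# network with M < N nodes, under standard Gaussian inputs x ~ N(0, I_d).
# The model f(x) = (1/num_nodes) * sum_i relu(w_i . x) and all parameter values
# below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def two_layer(X, W):
    # X: (num_samples, d) inputs; W: (num_nodes, d) first-layer weights.
    return relu(X @ W.T).mean(axis=1)

def population_l2_loss(W_target, W_comp, num_samples=50_000):
    # Monte Carlo approximation of E_x[(f_target(x) - f_comp(x))^2].
    d = W_target.shape[1]
    X = rng.standard_normal((num_samples, d))
    diff = two_layer(X, W_target) - two_layer(X, W_comp)
    return float(np.mean(diff ** 2))

def simplex_etf(M, d, scale=1.0):
    # M unit vectors in R^d forming a simplex equiangular tight frame
    # (pairwise inner products -1/(M-1)); requires M <= d. Used only to
    # illustrate the conjectured structure of the optimal compressed weights.
    U, _ = np.linalg.qr(rng.standard_normal((d, M)))   # M orthonormal columns
    P = np.eye(M) - np.ones((M, M)) / M                # centering projector
    V = U @ P
    V /= np.linalg.norm(V, axis=0, keepdims=True)      # unit-norm columns
    return scale * V.T                                 # shape (M, d)

d, N, M = 64, 256, 8
W_target = rng.standard_normal((N, d)) / np.sqrt(d)   # i.i.d. Gaussian (hence sub-Gaussian) target weights
W_random = rng.standard_normal((M, d)) / np.sqrt(d)   # naive compressed weights
W_etf = simplex_etf(M, d, scale=1.0 / np.sqrt(d))     # ETF-structured compressed weights

print("population L2 loss, random compressed weights:", population_l2_loss(W_target, W_random))
print("population L2 loss, ETF compressed weights:   ", population_l2_loss(W_target, W_etf))
```

In this toy setup the scaling and orientation of the compressed weights would still need to be optimized; the conjecture concerns the minimizer of the simplified mean-field objective, for which an ETF with an appropriate scale and rotation (determined by the target network) is conjectured to be optimal.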
Pages: 588-593 (6 pages)
Related papers (50 in total; entries [31]-[40] shown)
  • [31] Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
    Taheri, Hossein
    Thrampoulidis, Christos
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9944 - 9952
  • [32] Fast and Provable Algorithms for Learning Two-Layer Polynomial Neural Networks
    Soltani, Mohammadreza
    Hegde, Chinmay
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (13) : 3361 - 3371
  • [33] Synchronizability of two-layer correlation networks
    Wei, Xiang
    Wu, Xiaoqun
    Lu, Jun-An
    Wei, Juan
    Zhao, Junchan
    Wang, Yisi
    CHAOS, 2021, 31 (10)
  • [34] Untrusted Caches in Two-layer Networks
    Zewail, Ahmed A.
    Yener, Aylin
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019, : 1 - 5
  • [35] Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization
    Li, G.
    Wang, G.
    Ding, J.
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2023, 69 (09) : 5921 - 5935
  • [36] On the learning dynamics of two-layer quadratic neural networks for understanding deep learning
    Tan, Zhenghao
    Chen, Songcan
    FRONTIERS OF COMPUTER SCIENCE, 2022, 16 (03)
  • [37] Cumulant-based blind identification of two-layer feedforward neural networks
    Dai, Xianhua
    DIANZI YU XINXI XUEBAO / JOURNAL OF ELECTRONICS AND INFORMATION TECHNOLOGY, 2002, 24 (01)
  • [38] New method of training two-layer sigmoid neural networks using regularization
    Krutikov, V. N.
    Kazakovtsev, L. A.
    Shkaberina, G. Sh
    Kazakovtsev, V. L.
    INTERNATIONAL WORKSHOP ADVANCED TECHNOLOGIES IN MATERIAL SCIENCE, MECHANICAL AND AUTOMATION ENGINEERING - MIP: ENGINEERING - 2019, 2019, 537
  • [40] Self-Regularity of Output Weights for Overparameterized Two-Layer Neural Networks
    Gamarnik, David
    Kizildag, Eren C.
    Zadik, Ilias
    2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021, : 819 - 824