Fast and Provable Algorithms for Learning Two-Layer Polynomial Neural Networks

Cited by: 3
Authors
Soltani, Mohammadreza [1 ]
Hegde, Chinmay [1 ]
Affiliations
[1] Iowa State Univ, Elect & Comp Engn Dept, Ames, IA 50010 USA
Funding
U.S. National Science Foundation
Keywords
Deep learning; low-rank estimation; approximate algorithms; sample complexity; rank-one projection; phase retrieval; recovery
DOI
10.1109/TSP.2019.2916743
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics & Communication Technology]
Discipline Codes
0808; 0809
Abstract
In this paper, we bridge the problem of (provably) learning shallow neural networks with the well-studied problem of low-rank matrix estimation. In particular, we consider two-layer networks with quadratic activations, and focus on the under-parameterized regime where the number of neurons in the hidden layer is smaller than the dimension of the input. Our main approach is to "lift" the learning problem into a higher dimension, which enables us to borrow algorithmic techniques from low-rank matrix estimation. Using this intuition, we propose three novel, non-convex training algorithms. We support our algorithms with rigorous theoretical analysis, and show that all three enjoy linear convergence, fast running time per iteration, and near-optimal sample complexity. Finally, we complement our theoretical results with numerical experiments.
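The "lifting" idea in the abstract can be illustrated with a small sketch (not the authors' code; all variable names here are illustrative): a two-layer network with quadratic activations computes y = Σᵢ (wᵢᵀx)² = xᵀ(WᵀW)x, so learning the hidden-layer weights is equivalent to estimating the rank-r positive semidefinite matrix M = WᵀW, which is where low-rank matrix estimation techniques apply.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 3  # input dimension, hidden width (r < d: under-parameterized regime)

W = rng.standard_normal((r, d))  # hidden-layer weights, one row per neuron
x = rng.standard_normal(d)       # a single input sample

y_network = np.sum((W @ x) ** 2)  # forward pass with activation sigma(z) = z^2
M = W.T @ W                       # "lifted" variable: a d x d PSD matrix of rank r
y_lifted = x @ M @ x              # the same output as a quadratic form in M

assert np.isclose(y_network, y_lifted)     # lifting preserves the network output
assert np.linalg.matrix_rank(M) == r       # M is low-rank when r < d
```

The equivalence explains the under-parameterized focus: only when r < d is M genuinely low-rank, so that rank-one projection and related matrix-estimation machinery give leverage over direct weight-space training.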
Pages: 3361-3371 (11 pages)
Related Papers
50 records in total (first 10 shown)
  • [1] Plasticity of two-layer fast neural networks
    Alexeev, AA
    Dorogov, AY
    [J]. JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 1999, 38 (05) : 786 - 791
  • [2] Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
    Taheri, Hossein
    Thrampoulidis, Christos
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9944 - 9952
  • [3] Templates and algorithms for two-layer cellular neural Networks
    Yang, ZH
    Nishio, Y
    Ushida, A
    [J]. PROCEEDING OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, : 1946 - 1951
  • [4] Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization
    Li, G.
    Wang, G.
    Ding, J.
    [J]. IEEE Transactions on Information Theory, 2023, 69 (09) : 5921 - 5935
  • [6] Structural synthesis of fast two-layer neural networks
    Dorogov, AY
    [J]. CYBERNETICS AND SYSTEMS ANALYSIS, 2000, 36 (04) : 512 - 519
  • [7] Learning behavior and temporary minima of two-layer neural networks
    Annema, Anne-Johan
    Hoen, Klaas
    Wallinga, Hans
    [J]. Neural Networks, 1994, 7 (09): : 1387 - 1404
  • [8] Learning Two Layer Rectified Neural Networks in Polynomial Time
    Bakshi, Ainesh
    Jayaram, Rajesh
    Woodruff, David P.
    [J]. CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [9] On the learning dynamics of two-layer quadratic neural networks for understanding deep learning
    Tan, Zhenghao
    Chen, Songcan
    [J]. Frontiers of Computer Science, 2022, 16 (03) : 80 - 85