Fast and Provable Algorithms for Learning Two-Layer Polynomial Neural Networks

Cited by: 3
Authors
Soltani, Mohammadreza [1 ]
Hegde, Chinmay [1 ]
Affiliations
[1] Iowa State Univ, Elect & Comp Engn Dept, Ames, IA 50010 USA
Funding
U.S. National Science Foundation
Keywords
Deep learning; low-rank estimation; approximate algorithms; sample complexity; rank-one projection; phase retrieval; recovery
DOI
10.1109/TSP.2019.2916743
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics & Communication Technology]
Discipline Codes
0808; 0809
Abstract
In this paper, we bridge the problem of (provably) learning shallow neural networks with the well-studied problem of low-rank matrix estimation. In particular, we consider two-layer networks with quadratic activations, and focus on the under-parameterized regime where the number of neurons in the hidden layer is smaller than the dimension of the input. Our main approach is to "lift" the learning problem into a higher dimension, which enables us to borrow algorithmic techniques from low-rank matrix estimation. Using this intuition, we propose three novel, non-convex training algorithms. We support our algorithms with rigorous theoretical analysis, and show that all three enjoy linear convergence, fast running time per iteration, and near-optimal sample complexity. Finally, we complement our theoretical results with numerical experiments.
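The "lifting" idea in the abstract can be illustrated with a small sketch (not the authors' code; all variable names here are illustrative): a two-layer network with quadratic activations computes y = Σᵢ (wᵢᵀx)² = xᵀ(WᵀW)x, so learning the hidden-layer weights is equivalent to estimating the rank-r positive semidefinite matrix M = WᵀW, which is where low-rank matrix estimation techniques apply.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 3  # input dimension, hidden width (r < d: under-parameterized regime)

W = rng.standard_normal((r, d))  # hidden-layer weights, one row per neuron
x = rng.standard_normal(d)       # a single input sample

y_network = np.sum((W @ x) ** 2)  # forward pass with activation sigma(z) = z^2
M = W.T @ W                       # "lifted" variable: a d x d PSD matrix of rank r
y_lifted = x @ M @ x              # the same output as a quadratic form in M

assert np.isclose(y_network, y_lifted)     # lifting preserves the network output
assert np.linalg.matrix_rank(M) == r       # M is low-rank when r < d
```

The equivalence explains the under-parameterized focus: only when r < d is M genuinely low-rank, so that rank-one projection and related matrix-estimation machinery give leverage over direct weight-space training.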
Pages: 3361-3371 (11 pages)
Related Papers
50 records in total (first 10 shown)
  • [1] Plasticity of two-layer fast neural networks
    Alexeev, AA
    Dorogov, AY
    [J]. JOURNAL OF COMPUTER AND SYSTEMS SCIENCES INTERNATIONAL, 1999, 38 (05) : 786 - 791
  • [2] Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
    Taheri, Hossein
    Thrampoulidis, Christos
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9944 - 9952
  • [3] Templates and algorithms for two-layer cellular neural Networks
    Yang, ZH
    Nishio, Y
    Ushida, A
    [J]. PROCEEDING OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, : 1946 - 1951
  • [4] Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization
    Li, G.
    Wang, G.
    Ding, J.
    [J]. IEEE Transactions on Information Theory, 2023, 69 (09) : 5921 - 5935
  • [6] Structural synthesis of fast two-layer neural networks
    Dorogov, AY
    [J]. CYBERNETICS AND SYSTEMS ANALYSIS, 2000, 36 (04) : 512 - 519
  • [7] Learning behavior and temporary minima of two-layer neural networks
    Annema, Anne-Johan
    Hoen, Klaas
    Wallinga, Hans
    [J]. Neural Networks, 1994, 7 (09): : 1387 - 1404
  • [8] Learning Two Layer Rectified Neural Networks in Polynomial Time
    Bakshi, Ainesh
    Jayaram, Rajesh
    Woodruff, David P.
    [J]. CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [9] On the learning dynamics of two-layer quadratic neural networks for understanding deep learning
    Tan, Zhenghao
    Chen, Songcan
    [J]. Frontiers of Computer Science, 2022, 16 (03) : 80 - 85