On the Provable Generalization of Recurrent Neural Networks

Cited: 0
Authors
Wang, Lifu [1 ]
Shen, Bo [1 ]
Hu, Bo [1 ]
Cao, Xing [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
The Recurrent Neural Network (RNN) is a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that such networks can learn functions in certain notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization and provide the following improvements over recent works: (1) For an RNN with input sequence x = (x_1, x_2, ..., x_L), previous works study learning functions that are summations of f(\beta_l^T x_l) and require the normalization condition ||x_l|| <= \epsilon for some very small \epsilon depending on the complexity of f. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without normalization conditions, and show that some notable concept classes are learnable with the numbers of iterations and samples scaling almost-polynomially in the input length L. (2) Moreover, we prove a novel result for learning N-variable functions of the input sequence of the form f(\beta^T [x_{l_1}, ..., x_{l_N}]), which do not belong to the "additive" concept class, i.e., the summation of functions f(x_l). We show that when either N or l_0 = max(l_1, ..., l_N) - min(l_1, ..., l_N) is small, f(\beta^T [x_{l_1}, ..., x_{l_N}]) is learnable with the numbers of iterations and samples scaling almost-polynomially in the input length L.
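To make the two concept classes in the abstract concrete, here is a minimal NumPy sketch of how targets from each class could be generated from an RNN input sequence. It is an illustration only, not the authors' code: the choice of f, the dimension d, and the positions l_1, ..., l_N are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)
L, d = 10, 8                       # input length and per-step dimension
x = rng.normal(size=(L, d))        # input sequence x_1, ..., x_L (unnormalized)

f = np.tanh                        # any sufficiently smooth scalar function (assumed)

# (1) "Additive" concept class: y = sum_l f(beta_l^T x_l)
betas = rng.normal(size=(L, d))
y_additive = sum(f(betas[l] @ x[l]) for l in range(L))

# (2) N-variable concept class: y = f(beta^T [x_{l1}, ..., x_{lN}]),
#     which couples several positions and is not a sum of per-step terms.
positions = [2, 3, 5]              # l_1, ..., l_N (assumed for illustration)
beta = rng.normal(size=len(positions) * d)
y_nvar = f(beta @ np.concatenate([x[l] for l in positions]))

print(y_additive, y_nvar)

In this sketch the second target depends jointly on x_3, x_4, and x_6 (1-indexed), so it cannot be written as a sum of per-position terms; the paper's point (2) concerns when such targets remain learnable by an over-parameterized RNN.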
Pages: 12
Related Papers
50 records in total
  • [21] On temporal generalization of simple recurrent networks
    Wang, DL
    Liu, XM
    Ahalt, SC
    NEURAL NETWORKS, 1996, 9 (07) : 1099 - 1118
  • [22] Sensitivity-Informed Provable Pruning of Neural Networks
    Baykal, Cenk
    Liebenwein, Lucas
    Gilitschenski, Igor
    Feldman, Dan
    Rus, Daniela
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2022, 4 (01): : 26 - 45
  • [23] Provable Preimage Under-Approximation for Neural Networks
    Zhang, Xiyue
    Wang, Benjie
    Kwiatkowska, Marta
    TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, PT III, TACAS 2024, 2024, 14572 : 3 - 23
  • [24] Towards Effective Training of Robust Spiking Recurrent Neural Networks under General Input Noise via Provable Analysis
    Zheng, Wendong
    Zhou, Yu
    Chen, Gang
    Gu, Zonghua
    Huang, Kai
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [25] Recurrent neural networks
    Siegelmann, HT
    COMPUTER SCIENCE TODAY, 1995, 1000 : 29 - 45
  • [26] Provable Guarantees for Neural Networks via Gradient Feature Learning
    Shi, Zhenmei
    Wei, Junyi
    Liang, Yingyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [27] Architecture-Preserving Provable Repair of Deep Neural Networks
    Tao, Zhe
    Nawas, Stephanie
    Mitchell, Jacqueline
    Thakur, Aditya V.
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2023, 7 (PLDI):
  • [28] Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks
    Sebastian Bitzer
    Stefan J. Kiebel
    Biological Cybernetics, 2012, 106 : 201 - 217
  • [29] Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks
    Bitzer, Sebastian
    Kiebel, Stefan J.
    BIOLOGICAL CYBERNETICS, 2012, 106 (4-5) : 201 - 217
  • [30] PROVABLE TRANSLATIONAL ROBUSTNESS FOR OBJECT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
    Vierling, Axel
    James, Charu
    Berns, Karsten
    Katsaouni, Nikoletta
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 694 - 698