On the Provable Generalization of Recurrent Neural Networks

Cited: 0
Authors
Wang, Lifu [1 ]
Shen, Bo [1 ]
Hu, Bo [1 ]
Cao, Xing [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
The Recurrent Neural Network (RNN) is a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that such networks can learn functions in certain notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization and provide the following improvements over recent works: (1) For an RNN with input sequence x = (x_1, x_2, ..., x_L), previous works study learning functions that are summations of f(\beta_l^T x_l) and require the normalization condition ||x_l|| <= \epsilon for some very small \epsilon depending on the complexity of f. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without normalization conditions, and show that some notable concept classes are learnable with the numbers of iterations and samples scaling almost-polynomially in the input length L. (2) Moreover, we prove a novel result for learning N-variable functions of the input sequence of the form f(\beta^T [x_{l_1}, ..., x_{l_N}]), which do not belong to the "additive" concept class, i.e., the summation of functions f(x_l). We show that when either N or l_0 = max(l_1, ..., l_N) - min(l_1, ..., l_N) is small, f(\beta^T [x_{l_1}, ..., x_{l_N}]) is learnable with the numbers of iterations and samples scaling almost-polynomially in the input length L.
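To make the two concept classes in the abstract concrete, here is a minimal NumPy sketch of how targets from each class could be generated from an RNN input sequence. It is an illustration only, not the authors' code: the choice of f, the dimension d, and the positions l_1, ..., l_N are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)
L, d = 10, 8                       # input length and per-step dimension
x = rng.normal(size=(L, d))        # input sequence x_1, ..., x_L (unnormalized)

f = np.tanh                        # any sufficiently smooth scalar function (assumed)

# (1) "Additive" concept class: y = sum_l f(beta_l^T x_l)
betas = rng.normal(size=(L, d))
y_additive = sum(f(betas[l] @ x[l]) for l in range(L))

# (2) N-variable concept class: y = f(beta^T [x_{l1}, ..., x_{lN}]),
#     which couples several positions and is not a sum of per-step terms.
positions = [2, 3, 5]              # l_1, ..., l_N (assumed for illustration)
beta = rng.normal(size=len(positions) * d)
y_nvar = f(beta @ np.concatenate([x[l] for l in positions]))

print(y_additive, y_nvar)

In this sketch the second target depends jointly on x_3, x_4, and x_6 (1-indexed), so it cannot be written as a sum of per-position terms; the paper's point (2) concerns when such targets remain learnable by an over-parameterized RNN.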
Pages: 12
Related Papers
50 records in total
  • [21] On temporal generalization of simple recurrent networks
    Wang, DL
    Liu, XM
    Ahalt, SC
    NEURAL NETWORKS, 1996, 9 (07) : 1099 - 1118
  • [22] Sensitivity-Informed Provable Pruning of Neural Networks
    Baykal, Cenk
    Liebenwein, Lucas
    Gilitschenski, Igor
    Feldman, Dan
    Rus, Daniela
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2022, 4 (01): : 26 - 45
  • [23] Provable Preimage Under-Approximation for Neural Networks
    Zhang, Xiyue
    Wang, Benjie
    Kwiatkowska, Marta
    TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, PT III, TACAS 2024, 2024, 14572 : 3 - 23
  • [24] Towards Effective Training of Robust Spiking Recurrent Neural Networks under General Input Noise via Provable Analysis
    Zheng, Wendong
    Zhou, Yu
    Chen, Gang
    Gu, Zonghua
    Huang, Kai
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [25] Recurrent neural networks
    Siegelmann, HT
    COMPUTER SCIENCE TODAY, 1995, 1000 : 29 - 45
  • [26] Provable Guarantees for Neural Networks via Gradient Feature Learning
    Shi, Zhenmei
    Wei, Junyi
    Liang, Yingyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [27] Architecture-Preserving Provable Repair of Deep Neural Networks
    Tao, Zhe
    Nawas, Stephanie
    Mitchell, Jacqueline
    Thakur, Aditya V.
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2023, 7 (PLDI):
  • [28] Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks
    Sebastian Bitzer
    Stefan J. Kiebel
    Biological Cybernetics, 2012, 106 : 201 - 217
  • [29] Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks
    Bitzer, Sebastian
    Kiebel, Stefan J.
    BIOLOGICAL CYBERNETICS, 2012, 106 (4-5) : 201 - 217
  • [30] PROVABLE TRANSLATIONAL ROBUSTNESS FOR OBJECT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
    Vierling, Axel
    James, Charu
    Berns, Karsten
    Katsaouni, Nikoletta
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 694 - 698