On the Provable Generalization of Recurrent Neural Networks

Cited by: 0
Authors
Wang, Lifu [1 ]
Shen, Bo [1 ]
Hu, Bo [1 ]
Cao, Xing [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recurrent Neural Networks (RNNs) are a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that over-parameterized networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze training and generalization for RNNs with random initialization and provide the following improvements over recent works: (1) For an RNN with input sequence $x = (X_1, X_2, \ldots, X_L)$, previous works study learning functions that are summations of $f(\beta_l^\top X_l)$ and require normalization conditions $\|X_l\| \le \epsilon$ for some very small $\epsilon$ depending on the complexity of $f$. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without normalization conditions, and show that some notable concept classes are learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$. (2) Moreover, we prove a novel result for learning $N$-variable functions of the input sequence of the form $f(\beta^\top [X_{l_1}, \ldots, X_{l_N}])$, which do not belong to the "additive" concept class, i.e., the summation of functions $f(X_l)$. We show that when either $N$ or $l_0 = \max(l_1, \ldots, l_N) - \min(l_1, \ldots, l_N)$ is small, $f(\beta^\top [X_{l_1}, \ldots, X_{l_N}])$ is learnable with the number of iterations and samples scaling almost-polynomially in the input length $L$.
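To make the two concept classes described in the abstract concrete, here is a minimal NumPy sketch (not from the paper; the dimensions, the tanh link function, the ReLU recurrence, and the $1/\sqrt{m}$ initialization scaling are illustrative assumptions). It constructs an "additive" target $\sum_l f(\beta_l^\top X_l)$, an $N$-variable target $f(\beta^\top [X_{l_1}, \ldots, X_{l_N}])$, and a randomly initialized Elman-style RNN of the kind studied in NTK-type analyses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions for the sketch (hypothetical values, not taken from the paper).
d, L, m = 8, 10, 256          # input dimension, sequence length, hidden width

# --- Concept class (1): "additive" targets, sum over l of f(beta_l^T X_l) ---
betas = rng.normal(size=(L, d))
betas /= np.linalg.norm(betas, axis=1, keepdims=True)   # unit-norm directions

def additive_target(x, f=np.tanh):
    """x has shape (L, d); returns sum_l f(beta_l^T x_l)."""
    return sum(f(betas[l] @ x[l]) for l in range(L))

# --- Concept class (2): N-variable targets, f(beta^T [X_{l1}, ..., X_{lN}]) ---
positions = [2, 5, 6]                       # the indices l_1, ..., l_N (N = 3 here)
beta = rng.normal(size=(len(positions) * d,))
beta /= np.linalg.norm(beta)

def n_variable_target(x, f=np.tanh):
    """Depends jointly on the tokens at `positions`, so it is not additive."""
    concat = np.concatenate([x[l] for l in positions])
    return f(beta @ concat)

# --- A randomly initialized Elman-style RNN, as in NTK-style analyses ---
W = rng.normal(size=(m, m)) / np.sqrt(m)    # recurrent weights
A = rng.normal(size=(m, d)) / np.sqrt(m)    # input weights
b = rng.normal(size=(m,)) / np.sqrt(m)      # output weights

def rnn_output(x):
    h = np.zeros(m)
    for l in range(L):
        h = np.maximum(W @ h + A @ x[l], 0.0)   # ReLU recurrence
    return b @ h

x = rng.normal(size=(L, d))
print(additive_target(x), n_variable_target(x), rnn_output(x))
```

The sketch only fixes notation, not the learning guarantee: in the paper's setting, the additive targets are shown to be learnable without the $\|X_l\| \le \epsilon$ normalization, and the joint targets become harder as $N$ or the span $l_0$ grows.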
Pages: 12