Convergence of deep ReLU networks

Times Cited: 10
Authors
Xu, Yuesheng [1 ]
Zhang, Haizhang [2 ]
Affiliations
[1] Old Dominion Univ, Dept Math & Stat, Norfolk, VA 23529 USA
[2] Sun Yat-sen Univ, Sch Math Zhuhai, Zhuhai, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China; US National Institutes of Health
Keywords
Deep learning; ReLU networks; Activation domains; Infinite product of matrices; Error bounds; Width
DOI
10.1016/j.neucom.2023.127174
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We explore the convergence of deep neural networks with the popular ReLU activation function as the depth of the networks tends to infinity. To this end, we introduce the notions of activation domains and activation matrices of a ReLU network. By replacing applications of the ReLU activation function with multiplications by activation matrices on activation domains, we obtain an explicit expression for the ReLU network. We then identify the convergence of ReLU networks with the convergence of a class of infinite products of matrices, and we study sufficient and necessary conditions for the convergence of these infinite products. As a result, we establish necessary conditions for ReLU networks to converge: the sequence of weight matrices must converge to the identity matrix and the sequence of bias vectors must converge to zero as the depth of the ReLU networks increases to infinity. Moreover, we obtain sufficient conditions, in terms of the weight matrices and bias vectors at the hidden layers, for the pointwise convergence of deep ReLU networks. These results provide mathematical insights into the convergence of deep neural networks. Experiments are conducted to numerically verify the results and to illustrate their potential usefulness in the initialization of deep neural networks.
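The two central points of the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' code: the relation relu(z) = D(z)z, with D(z) a diagonal 0/1 activation matrix, is the activation-matrix idea described above, while the layer choices W_k = I + A/k^2 and b_k = b/k^2 are hypothetical examples of weight matrices tending to the identity and bias vectors tending to zero.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def activation_matrix(z):
    # Diagonal 0/1 matrix D(z) with D_ii = 1 exactly when z_i > 0, so relu(z) = D(z) @ z.
    return np.diag((z > 0).astype(float))

rng = np.random.default_rng(0)
d = 3

# (1) Applying ReLU equals multiplying by the activation matrix of the input's sign pattern.
z = rng.standard_normal(d)
assert np.allclose(relu(z), activation_matrix(z) @ z)

# (2) Layer outputs of a deep ReLU network whose weights drift to the identity
#     and whose biases drift to zero (summable perturbations), at a fixed input.
A = 0.5 * rng.standard_normal((d, d))
b = 0.5 * rng.standard_normal(d)
h = rng.standard_normal(d)          # the input x
for k in range(1, 51):
    W_k = np.eye(d) + A / k**2      # weight matrices -> identity
    b_k = b / k**2                  # bias vectors -> 0
    h = relu(W_k @ h + b_k)
    if k in (1, 5, 10, 50):
        print(f"output after {k:2d} layers: {h}")
# The printed outputs change less and less as the depth grows, consistent with
# pointwise convergence of the deep ReLU network under these choices.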
Pages: 13
Related Papers
50 records in total
  • [21] Correction: Approximation of Nonlinear Functionals Using Deep ReLU Networks
    Song, Linhao
    Fan, Jun
    Chen, Di-Rong
    Zhou, Ding-Xuan
    JOURNAL OF FOURIER ANALYSIS AND APPLICATIONS, 2023, 29
  • [22] Robust nonparametric regression based on deep ReLU neural networks
    Chen, Juntong
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2024, 233
  • [23] Precise characterization of the prior predictive distribution of deep ReLU networks
    Noci, Lorenzo
    Bachmann, Gregor
    Roth, Kevin
    Nowozin, Sebastian
    Hofmann, Thomas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [24] Deep ReLU Networks Have Surprisingly Few Activation Patterns
    Hanin, Boris
    Rolnick, David
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [25] On Centralization and Unitization of Batch Normalization for Deep ReLU Neural Networks
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 2827 - 2841
  • [26] Deep ReLU neural networks in high-dimensional approximation
    Dung, Dinh
    Nguyen, Van Kien
    NEURAL NETWORKS, 2021, 142 : 619 - 635
  • [27] ReLU deep neural networks from the hierarchical basis perspective
    He, Juncai
    Li, Lin
    Xu, Jinchao
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2022, 120 : 105 - 114
  • [28] How Do Noise Tails Impact on Deep ReLU Networks?
    Fan, Jianqing
    Gu, Yihong
    Zhou, Wen-Xin
    ANNALS OF STATISTICS, 2024, 52 (04) : 1845 - 1871
  • [29] Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
    Jentzen, Arnulf
    Riekert, Adrian
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2023, 517 (02)
  • [30] Gradient descent optimizes over-parameterized deep ReLU networks
    Zou, Difan
    Cao, Yuan
    Zhou, Dongruo
    Gu, Quanquan
    MACHINE LEARNING, 2020, 109 : 467 - 492