Convergence of deep ReLU networks

Cited by: 10
Authors
Xu, Yuesheng [1]
Zhang, Haizhang [2]
Affiliations
[1] Old Dominion Univ, Dept Math & Stat, Norfolk, VA 23529 USA
[2] Sun Yat-sen Univ, Sch Math Zhuhai, Zhuhai, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China; US National Institutes of Health
Keywords
Deep learning; ReLU networks; Activation domains; Infinite product of matrices; Error bounds; Width
DOI
10.1016/j.neucom.2023.127174
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We explore the convergence of deep neural networks with the popular ReLU activation function as the depth of the networks tends to infinity. To this end, we introduce the notions of activation domains and activation matrices of a ReLU network. By replacing each application of the ReLU activation function with multiplication by an activation matrix on the corresponding activation domain, we obtain an explicit expression for the ReLU network. We then identify the convergence of ReLU networks with the convergence of a class of infinite products of matrices, and we study sufficient and necessary conditions for the convergence of these infinite products. As a result, we establish necessary conditions for a ReLU network to converge: the sequence of weight matrices must converge to the identity matrix, and the sequence of bias vectors must converge to the zero vector, as the depth of the network increases to infinity. Moreover, we obtain sufficient conditions, stated in terms of the weight matrices and bias vectors of the hidden layers, for the pointwise convergence of deep ReLU networks. These results provide mathematical insight into the convergence of deep neural networks. Experiments are conducted to numerically verify the results and to illustrate their potential usefulness in the initialization of deep neural networks.
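To make the activation-matrix reformulation concrete, the following NumPy sketch (our illustration, not the authors' code) shows one natural construction: on the activation domain where the pre-activation Wx + b has a fixed sign pattern, applying ReLU coincides with multiplying by a diagonal 0/1 activation matrix, so the layer acts as an explicit linear map there.

```python
import numpy as np

def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)

def activation_matrix(z):
    """Diagonal 0/1 matrix recording which coordinates of the
    pre-activation z are positive; it is constant on the activation
    domain containing the current input."""
    return np.diag((z > 0).astype(float))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
x = rng.standard_normal(4)

z = W @ x + b
D = activation_matrix(z)

# On this activation domain, ReLU acts linearly:
# relu(W x + b) = D (W x + b).
assert np.allclose(relu(z), D @ (W @ x + b))
print(D.diagonal())  # the 0/1 sign pattern defining the domain
```

Composing layers under this identity turns the deep network into a product of matrices, which is what reduces depthwise convergence to the convergence of infinite matrix products.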
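The conditions can also be probed numerically. Below is a toy check with hypothetical parameter choices (not the paper's experiments): the n-th hidden layer uses W_n = I + A/n^2 and b_n = b/n^2, so the weight matrices tend to the identity and the bias vectors tend to zero summably; the output then stabilizes as the depth grows, consistent with pointwise convergence.

```python
import numpy as np

def deep_relu_output(x, depth, seed=42):
    """Output of a deep ReLU network whose n-th hidden layer uses
    W_n = I + A/n^2 and b_n = b/n^2 (an illustrative summable decay),
    so that W_n -> I and b_n -> 0, matching the necessary conditions
    for convergence. A and b are fixed across layers."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((x.size, x.size))
    b = rng.standard_normal(x.size)
    h = x
    for n in range(1, depth + 1):
        h = np.maximum((np.eye(x.size) + A / n**2) @ h + b / n**2, 0.0)
    return h

x = np.array([0.3, -0.7, 1.1])
for depth in (10, 100, 1000):
    print(depth, deep_relu_output(x, depth))
# The outputs change negligibly beyond moderate depths, illustrating
# pointwise convergence; a fixed W far from I would not behave this way.
```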
Pages: 13
Related Papers (50 in total)
  • [31] On the CVP for the root lattices via folding with deep ReLU neural networks
    Corlay, Vincent
    Boutros, Joseph J.
    Ciblat, Philippe
    Brunel, Loic
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019: 1622-1626
  • [32] DEEP RELU NETWORKS OVERCOME THE CURSE OF DIMENSIONALITY FOR GENERALIZED BANDLIMITED FUNCTIONS
    Montanelli, Hadrien
    Yang, Haizhao
    Du, Qiang
    JOURNAL OF COMPUTATIONAL MATHEMATICS, 2021, 39 (06): 801-815
  • [33] On the uniform approximation estimation of deep ReLU networks via frequency decomposition
    Chen, Liang
    Liu, Wenjun
    AIMS MATHEMATICS, 2022, 7 (10): 19018-19025
  • [34] Deep ReLU networks and high-order finite element methods
    Opschoor, Joost A. A.
    Petersen, Philipp C.
    Schwab, Christoph
    ANALYSIS AND APPLICATIONS, 2020, 18 (05): 715-770
  • [35] Gradient descent optimizes over-parameterized deep ReLU networks
    Zou, Difan
    Cao, Yuan
    Zhou, Dongruo
    Gu, Quanquan
    MACHINE LEARNING, 2020, 109 (03): 467-492
  • [36] New Error Bounds for Deep ReLU Networks Using Sparse Grids
    Montanelli, Hadrien
    Du, Qiang
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (01): 78-92
  • [37] Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
    Chen, Minshuo
    Jiang, Haoming
    Liao, Wenjing
    Zhao, Tuo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [38] NONPARAMETRIC REGRESSION USING DEEP NEURAL NETWORKS WITH RELU ACTIVATION FUNCTION
    Schmidt-Hieber, Johannes
    ANNALS OF STATISTICS, 2020, 48 (04): 1875-1897
  • [39] Approximation in shift-invariant spaces with deep ReLU neural networks
    Yang, Yunfei
    Li, Zhen
    Wang, Yang
    NEURAL NETWORKS, 2022, 153: 269-281
  • [40] Trajectory growth lower bounds for random sparse deep ReLU networks
    Price, Ilan
    Tanner, Jared
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021: 1004-1009