Convergence of deep ReLU networks

Cited by: 10
Authors
Xu, Yuesheng [1]
Zhang, Haizhang [2]
Affiliations
[1] Old Dominion Univ, Dept Math & Stat, Norfolk, VA 23529 USA
[2] Sun Yat-sen Univ, Sch Math Zhuhai, Zhuhai, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China; US National Institutes of Health
Keywords
Deep learning; ReLU networks; Activation domains; Infinite product of matrices; Error bounds; Width
DOI
10.1016/j.neucom.2023.127174
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We explore the convergence of deep neural networks with the popular ReLU activation function as the depth of the networks tends to infinity. To this end, we introduce the notions of activation domains and activation matrices of a ReLU network. By replacing each application of the ReLU activation function with multiplication by an activation matrix on the corresponding activation domain, we obtain an explicit expression for the ReLU network. We then identify the convergence of ReLU networks with the convergence of a class of infinite products of matrices, and we study sufficient and necessary conditions for the convergence of these infinite products. As a result, we establish necessary conditions for a ReLU network to converge: the sequence of weight matrices must converge to the identity matrix, and the sequence of bias vectors must converge to the zero vector, as the depth of the network increases to infinity. Moreover, we obtain sufficient conditions, stated in terms of the weight matrices and bias vectors of the hidden layers, for the pointwise convergence of deep ReLU networks. These results provide mathematical insight into the convergence of deep neural networks. Experiments are conducted to numerically verify the results and to illustrate their potential usefulness in the initialization of deep neural networks.
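To make the activation-matrix reformulation concrete, the following NumPy sketch (our illustration, not the authors' code) shows one natural construction: on the activation domain where the pre-activation Wx + b has a fixed sign pattern, applying ReLU coincides with multiplying by a diagonal 0/1 activation matrix, so the layer acts as an explicit linear map there.

```python
import numpy as np

def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)

def activation_matrix(z):
    """Diagonal 0/1 matrix recording which coordinates of the
    pre-activation z are positive; it is constant on the activation
    domain containing the current input."""
    return np.diag((z > 0).astype(float))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
x = rng.standard_normal(4)

z = W @ x + b
D = activation_matrix(z)

# On this activation domain, ReLU acts linearly:
# relu(W x + b) = D (W x + b).
assert np.allclose(relu(z), D @ (W @ x + b))
print(D.diagonal())  # the 0/1 sign pattern defining the domain
```

Composing layers under this identity turns the deep network into a product of matrices, which is what reduces depthwise convergence to the convergence of infinite matrix products.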
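The conditions can also be probed numerically. Below is a toy check with hypothetical parameter choices (not the paper's experiments): the n-th hidden layer uses W_n = I + A/n^2 and b_n = b/n^2, so the weight matrices tend to the identity and the bias vectors tend to zero summably; the output then stabilizes as the depth grows, consistent with pointwise convergence.

```python
import numpy as np

def deep_relu_output(x, depth, seed=42):
    """Output of a deep ReLU network whose n-th hidden layer uses
    W_n = I + A/n^2 and b_n = b/n^2 (an illustrative summable decay),
    so that W_n -> I and b_n -> 0, matching the necessary conditions
    for convergence. A and b are fixed across layers."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((x.size, x.size))
    b = rng.standard_normal(x.size)
    h = x
    for n in range(1, depth + 1):
        h = np.maximum((np.eye(x.size) + A / n**2) @ h + b / n**2, 0.0)
    return h

x = np.array([0.3, -0.7, 1.1])
for depth in (10, 100, 1000):
    print(depth, deep_relu_output(x, depth))
# The outputs change negligibly beyond moderate depths, illustrating
# pointwise convergence; a fixed W far from I would not behave this way.
```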
Pages: 13
Related Papers (50 in total)
  • [31] On the CVP for the root lattices via folding with deep ReLU neural networks
    Corlay, Vincent
    Boutros, Joseph J.
    Ciblat, Philippe
    Brunel, Loic
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019: 1622-1626
  • [32] DEEP RELU NETWORKS OVERCOME THE CURSE OF DIMENSIONALITY FOR GENERALIZED BANDLIMITED FUNCTIONS
    Montanelli, Hadrien
    Yang, Haizhao
    Du, Qiang
    JOURNAL OF COMPUTATIONAL MATHEMATICS, 2021, 39 (06): 801-815
  • [33] On the uniform approximation estimation of deep ReLU networks via frequency decomposition
    Chen, Liang
    Liu, Wenjun
    AIMS MATHEMATICS, 2022, 7 (10): 19018-19025
  • [34] Deep ReLU networks and high-order finite element methods
    Opschoor, Joost A. A.
    Petersen, Philipp C.
    Schwab, Christoph
    ANALYSIS AND APPLICATIONS, 2020, 18 (05): 715-770
  • [35] Gradient descent optimizes over-parameterized deep ReLU networks
    Zou, Difan
    Cao, Yuan
    Zhou, Dongruo
    Gu, Quanquan
    MACHINE LEARNING, 2020, 109 (03): 467-492
  • [36] New Error Bounds for Deep ReLU Networks Using Sparse Grids
    Montanelli, Hadrien
    Du, Qiang
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (01): 78-92
  • [37] Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
    Chen, Minshuo
    Jiang, Haoming
    Liao, Wenjing
    Zhao, Tuo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [38] NONPARAMETRIC REGRESSION USING DEEP NEURAL NETWORKS WITH RELU ACTIVATION FUNCTION
    Schmidt-Hieber, Johannes
    ANNALS OF STATISTICS, 2020, 48 (04): 1875-1897
  • [39] Approximation in shift-invariant spaces with deep ReLU neural networks
    Yang, Yunfei
    Li, Zhen
    Wang, Yang
    NEURAL NETWORKS, 2022, 153: 269-281
  • [40] Trajectory growth lower bounds for random sparse deep ReLU networks
    Price, Ilan
    Tanner, Jared
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021: 1004-1009