NORMALIZATION EFFECTS ON DEEP NEURAL NETWORKS

Cited by: 1
Authors
Yu, Jiahui [1 ]
Spiliopoulos, Konstantinos [1 ]
Affiliations
[1] Boston Univ, Dept Math & Stat, Boston, MA 02215 USA
Source
FOUNDATIONS OF DATA SCIENCE | 2023, Vol. 5, No. 3
Funding
National Science Foundation (U.S.);
Keywords
Machine learning; neural networks; normalization effect; scaling effects; asymptotic expansions; out-of-sample performance; APPROXIMATION;
DOI
10.3934/fods.2023004
CLC Classification
O29 [Applied Mathematics];
Subject Classification
070104;
Abstract
We study the effect of normalization on the layers of deep neural networks of feed-forward type. A given layer i with N_i hidden units is allowed to be normalized by 1/N_i^{γ_i} with γ_i ∈ [1/2, 1], and we study the effect of the choice of the γ_i on the statistical behavior of the neural network's output (such as its variance) as well as on the test accuracy on the MNIST data set. We find that, in terms of both the variance of the output and test accuracy, the best choice is to set the γ_i equal to one, which is the mean-field scaling. This is particularly true for the outer layer: the network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion of the neural network's output. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning-rate hyperparameters; such a choice guarantees that the neural network behaves in a statistically robust way as the N_i grow to infinity.
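To make the scaling concrete, here is a minimal sketch in PyTorch of where the 1/N_i^{γ_i} factor enters a feed-forward pass. Everything in it (the architecture, the widths, the tanh activation, the base learning rate, and the N_i^{2γ_i - 1} learning-rate rule) is an illustrative assumption in the spirit of the abstract, not code or a formula quoted from the paper.

import torch
import torch.nn as nn

class NormalizedMLP(nn.Module):
    # Feed-forward net in which the sum over the N_i units of hidden
    # layer i is normalized by 1/N_i**gamma_i, with gamma_i in [1/2, 1].
    def __init__(self, in_dim=784, hidden=(128, 128), out_dim=10,
                 gammas=(1.0, 1.0)):
        super().__init__()
        assert len(gammas) == len(hidden)
        widths = (in_dim, *hidden, out_dim)
        self.linears = nn.ModuleList(
            nn.Linear(widths[k], widths[k + 1], bias=False)
            for k in range(len(widths) - 1)
        )
        self.hidden, self.gammas = hidden, gammas

    def forward(self, x):
        # The first linear map sums over the input dimension, so no
        # 1/N**gamma factor is applied to it.
        x = torch.tanh(self.linears[0](x))
        for i, lin in enumerate(self.linears[1:]):
            # The sum over the N_i = hidden[i] units of hidden layer i
            # is divided by N_i**gamma_i; the last such factor is the
            # outer-layer scaling the abstract highlights as most
            # influential.
            x = lin(x) / (self.hidden[i] ** self.gammas[i])
            if i < len(self.hidden) - 1:
                x = torch.tanh(x)
        return x

# Per-layer learning rates: one rule consistent with this scaling family
# (an assumption here, following related mean-field analyses, not a
# formula quoted from this paper) is lr_i ~ N_i**(2*gamma_i - 1), which
# keeps the output O(1) during training as the N_i grow.
model = NormalizedMLP()
groups = [{"params": model.linears[0].parameters(), "lr": 0.1}] + [
    {"params": lin.parameters(),
     "lr": 0.1 * model.hidden[i] ** (2 * model.gammas[i] - 1)}
    for i, lin in enumerate(model.linears[1:])
]
optimizer = torch.optim.SGD(groups)

With γ_i = 1 this is the mean-field scaling the abstract recommends; setting γ_i = 1/2 instead recovers the familiar 1/√(N_i) normalization, for which the per-layer learning rates above reduce to the O(1) base rate.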
Pages: 389-465 (77 pages)
Related Papers (50 in total)
  • [31] Input signals normalization in Kohonen neural networks
    Bielecki, Andrzej
    Bielecka, Marzena
    Chmielowiec, Anna
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING - ICAISC 2008, PROCEEDINGS, 2008, 5097: 3-+
  • [32] Riemannian batch normalization for SPD neural networks
    Brooks, Daniel
    Schwander, Olivier
    Barbaresco, Frederic
    Schneider, Jean-Yves
    Cord, Matthieu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [33] Neural networks with divisive normalization for image segmentation
    Hernandez-Camara, Pablo
    Vila-Tomas, Jorge
    Laparra, Valero
    Malo, Jesus
    PATTERN RECOGNITION LETTERS, 2023, 173: 64-71
  • [34] Learning graph normalization for graph neural networks
    Chen, Yihao
    Tang, Xin
    Qi, Xianbiao
    Li, Chun-Guang
    Xiao, Rong
    NEUROCOMPUTING, 2022, 493: 613-625
  • [35] Inherent Weight Normalization in Stochastic Neural Networks
    Detorakis, Georgios
    Dutta, Sourav
    Khanna, Abhishek
    Jerry, Matthew
    Datta, Suman
    Neftci, Emre
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [36] Adaptive Input Normalization for Quantized Neural Networks
    Schmidt, Jan
    Fiser, Petr
    Skrbek, Miroslav
    2024 27TH INTERNATIONAL SYMPOSIUM ON DESIGN & DIAGNOSTICS OF ELECTRONIC CIRCUITS & SYSTEMS, DDECS, 2024: 130-135
  • [37] Deep neural network for fringe pattern filtering and normalization
    Reyes-Figueroa, Alan
    Flores, Victor H.
    Rivera, Mariano
    APPLIED OPTICS, 2021, 60 (07): 2022-2036
  • [39] Irrelevant Variability Normalization via Hierarchical Deep Neural Networks for Online Handwritten Chinese Character Recognition
    Du, Jun
    2014 14TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2014: 303-308
  • [40] A note on factor normalization for deep neural network models
    Qi, Haobo
    Zhou, Jing
    Wang, Hansheng
    SCIENTIFIC REPORTS, 2022, 12 (01)