An empirical analysis of the shift and scale parameters in BatchNorm

Cited by: 4
Authors
Peerthum, Yashna [1]
Stamp, Mark [1]
Affiliation
[1] San Jose State Univ, Dept Comp Sci, San Jose, CA 95192 USA
DOI
10.1016/j.ins.2023.118951
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Batch Normalization (BatchNorm) is a technique that improves the training of deep neural networks, especially Convolutional Neural Networks (CNN). It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for such improvements are unclear. BatchNorm includes a normalization step as well as trainable shift and scale parameters. In this paper, we empirically examine the relative contribution to the success of BatchNorm of the normalization step, as compared to the re-parameterization via shifting and scaling. To conduct our experiments, we implement two new optimizers in PyTorch, namely, a version of BatchNorm that we refer to as AffineLayer, which includes the re-parameterization step without normalization, and a version with just the normalization step, that we call BatchNorm-minus. We compare the performance of our AffineLayer and BatchNorm-minus implementations to standard BatchNorm, and we also compare these to the case where no batch normalization is used. We experiment with four ResNet architectures (ResNet18, ResNet34, ResNet50, and ResNet101) over a standard image dataset and multiple batch sizes. Among other findings, we provide empirical evidence that the success of BatchNorm may derive primarily from improved weight initialization.
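The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' PyTorch implementation; it is a plain-Python illustration for a single feature over one mini-batch, under stated assumptions. The function names `normalize`, `affine`, and `batch_norm`, the parameter names `gamma` and `beta`, and the value of `EPS` are all hypothetical choices made for this example. `normalize` corresponds to the idea behind BatchNorm-minus (normalization only), `affine` to the idea behind AffineLayer (scale and shift only), and `batch_norm` composes the two.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # assumed small constant for numerical stability


def normalize(batch, eps=EPS):
    """Normalization step only (the idea behind BatchNorm-minus)."""
    mu = fmean(batch)
    var = pvariance(batch, mu)  # biased (population) variance of the batch
    return [(x - mu) / (var + eps) ** 0.5 for x in batch]


def affine(batch, gamma=1.0, beta=0.0):
    """Scale and shift only (the idea behind AffineLayer).

    In a real network gamma and beta would be trainable parameters;
    here they are plain floats for illustration.
    """
    return [gamma * x + beta for x in batch]


def batch_norm(batch, gamma=1.0, beta=0.0, eps=EPS):
    """Standard BatchNorm forward pass: normalize, then scale and shift."""
    return affine(normalize(batch, eps), gamma, beta)


if __name__ == "__main__":
    activations = [1.0, 2.0, 3.0, 4.0]
    print("normalization only:", normalize(activations))
    print("affine only:       ", affine(activations, gamma=2.0, beta=1.0))
    print("full BatchNorm:    ", batch_norm(activations, gamma=2.0, beta=1.0))
```

The sketch covers only the training-time forward pass on batch statistics; running averages for inference and per-channel handling of image tensors are omitted.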
Pages: 15