On the training dynamics of deep networks with L2 regularization

Cited by: 0
|
Authors
Lewkowycz, Aitor [1 ]
Gur-Ari, Guy [1 ]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Keywords
DOI
None available
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the role of L2 regularization in deep learning, and uncover simple relations between the performance of the model, the L2 coefficient, the learning rate, and the number of training steps. These empirical relations hold when the network is overparameterized. They can be used to predict the optimal regularization parameter of a given model. In addition, based on these observations we propose a dynamical schedule for the regularization parameter that improves performance and speeds up training. We test these proposals in modern image classification settings. Finally, we show that these empirical relations can be understood theoretically in the context of infinitely wide networks. We derive the gradient flow dynamics of such networks, and compare the role of L2 regularization in this context with that of linear models.
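The abstract's relation between the L2 coefficient and the learning rate can be illustrated with a minimal sketch (not the authors' code): in SGD with an explicit L2 penalty, the update w ← w − lr·(∇loss + l2·w) shrinks the weights by a factor (1 − lr·l2) each step, so the product lr·l2 sets the effective decay rate. The toy problem and parameter values below are illustrative assumptions.

```python
import numpy as np

def sgd_l2(X, y, l2, lr, steps, seed=0):
    """Plain SGD on linear least squares with an explicit L2 penalty."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * MSE
        w -= lr * (grad + l2 * w)          # L2 term contributes l2 * w
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = np.ones(5)
y = X @ w_true

w_no_reg = sgd_l2(X, y, l2=0.0, lr=0.1, steps=500)
w_reg = sgd_l2(X, y, l2=0.5, lr=0.1, steps=500)

# The penalty shrinks the learned weights toward zero:
print(np.linalg.norm(w_reg) < np.linalg.norm(w_no_reg))  # True
```

Rerunning with a smaller learning rate but a proportionally larger l2 gives a similar amount of shrinkage, which is one way to see why, as the abstract states, performance depends on the L2 coefficient and the learning rate jointly rather than separately.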
Pages: 10
Related papers
50 records in total
  • [1] Enhance the Performance of Deep Neural Networks via L2 Regularization on the Input of Activations
    Shi, Guang
    Zhang, Jiangshe
    Li, Huirong
    Wang, Changpeng
    NEURAL PROCESSING LETTERS, 2019, 50 (01) : 57 - 75
  • [3] Deep neural networks with L1 and L2 regularization for high dimensional corporate credit risk prediction
    Yang, Mei
    Lim, Ming K.
    Qu, Yingchi
    Li, Xingzhi
    Ni, Du
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [4] A Hybrid Improved Neural Networks Algorithm Based on L2 and Dropout Regularization
    Xie, Xiaoyun
    Xie, Ming
    Moshayedi, Ata Jahangir
    Skandari, Mohammad Hadi Noori
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [7] ELM with L1/L2 regularization constraints
    Feng B.
    Qin K.
    Jiang Z.
    Hanjie Xuebao/Transactions of the China Welding Institution, 2018, 39 (09): : 31 - 35
  • [8] Stochastic PCA with l2 and l1 Regularization
    Mianjy, Poorya
    Arora, Raman
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [9] αl1 - βl2 regularization for sparse recovery
    Ding, Liang
    Han, Weimin
    INVERSE PROBLEMS, 2019, 35 (12)
  • [10] Deciphering the Coevolutionary Dynamics of L2 β-Lactamases via Deep Learning
    Zhu, Yu
    Gu, Jing
    Zhao, Zhuoran
    Chan, A. W. Edith
    Mojica, Maria F.
    Hujer, Andrea M.
    Bonomo, Robert A.
    Haider, Shozeb
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2024, 64 (09) : 3706 - 3717