Redundant representations help generalization in wide neural networks

Cited by: 0
Authors
Doimo, Diego [1 ]
Glielmo, Aldo [2 ]
Goldt, Sebastian [1 ]
Laio, Alessandro [1 ]
Affiliations
[1] Scuola Int Super Studi Avanzati, Trieste, Italy
[2] Bank of Italy; Int Sch Adv Studies, Trieste, Italy
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks (DNNs) defy the classical bias-variance trade-off: adding parameters to a DNN that interpolates its training data will typically improve its generalization performance. Explaining the mechanism behind this "benign overfitting" in deep networks remains an outstanding challenge. Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise. The number of such groups increases linearly with the width of the layer, but only if the width is above a critical value. We show that redundant neurons appear only when the training is regularized and the training error is zero.
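The abstract describes the observation qualitatively. The following minimal Python sketch shows one hypothetical way to probe for such redundant neuron groups: cluster the neurons of the last hidden layer by the absolute correlation of their activations across a dataset. The correlation threshold, the use of average-linkage hierarchical clustering, and the synthetic data in the usage example are assumptions made for illustration; this is not the analysis pipeline used in the paper.

```python
# Illustrative sketch (not the paper's actual method): group neurons of a wide
# last hidden layer by the similarity of their activation patterns, as a rough
# proxy for the "redundant groups" described in the abstract.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def redundant_neuron_groups(activations: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Cluster neurons whose activations are almost perfectly correlated.

    activations: (n_samples, width) matrix of last-hidden-layer activations
    threshold:   minimum absolute Pearson correlation for two neurons to be
                 treated as carrying the same information (hypothetical cut-off)
    Returns an array of length `width` with a group label for each neuron.
    """
    # Correlation between neurons (columns); distance = 1 - |corr|
    corr = np.corrcoef(activations, rowvar=False)
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    np.fill_diagonal(dist, 0.0)

    # Condensed distance matrix for average-linkage hierarchical clustering
    iu = np.triu_indices_from(dist, k=1)
    Z = linkage(dist[iu], method="average")

    # Neurons closer than (1 - threshold) end up in the same group
    return fcluster(Z, t=1.0 - threshold, criterion="distance")

# Usage example on synthetic data: 64 independent features, each duplicated
# 4 times with small independent noise, mimicking a 256-neuron redundant layer.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.standard_normal((1000, 64))
    neurons = np.repeat(signal, 4, axis=1)
    neurons += 0.05 * rng.standard_normal(neurons.shape)
    labels = redundant_neuron_groups(neurons)
    print("width:", neurons.shape[1], "groups:", labels.max())
```

On this synthetic example the sketch should recover roughly 64 groups for the 256-neuron layer, loosely mirroring the abstract's statement that the number of redundant groups grows linearly with layer width above a critical value.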
Pages: 14