Neural Redshift: Random Networks are not Random Functions

Cited by: 0
Authors
Teney, Damien [1 ]
Nicolicioiu, Armand Mihai [2 ]
Hartmann, Valentin [3 ]
Abbasnejad, Ehsan [4 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Swiss Fed Inst Technol, Zurich, Switzerland
[3] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[4] Univ Adelaide, Adelaide, SA, Australia
Funding
Australian Research Council;
Keywords
COMPLEXITY;
DOI
10.1109/CVPR52733.2024.00458
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Our understanding of the generalization capabilities of neural networks (NNs) is still incomplete. Prevailing explanations are based on implicit biases of gradient descent (GD), but they cannot account for the capabilities of models obtained with gradient-free methods [9], nor for the simplicity bias recently observed in untrained networks [29]. This paper seeks other sources of generalization in NNs. Findings. To understand the inductive biases provided by architectures independently of GD, we examine untrained, random-weight networks. Even simple MLPs show strong inductive biases: uniform sampling in weight space yields a very biased distribution of functions in terms of complexity. But contrary to common wisdom, NNs do not have an inherent "simplicity bias". This property depends on components such as ReLUs, residual connections, and layer normalization. Alternative architectures can be built with a bias for any level of complexity. Transformers also inherit all these properties from their building blocks. Implications. We provide a fresh explanation for the success of deep learning, independent of gradient-based training. It points to promising avenues for controlling the solutions implemented by trained models.
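The core experiment described in the abstract can be illustrated with a minimal sketch (an illustrative assumption, not the authors' exact protocol): sample many MLPs with random Gaussian weights, evaluate each on a 1-D input grid, and use the fraction of high-frequency Fourier energy as a crude complexity proxy. The activations compared (ReLU vs. a scaled sine) and all numeric settings below are illustrative choices.

import numpy as np

def random_mlp(widths, activation, rng, weight_scale=1.0):
    # Random-weight MLP as a closure; weights ~ N(0, weight_scale^2 / fan_in), small random biases.
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        W = rng.normal(0.0, weight_scale / np.sqrt(fan_in), size=(fan_in, fan_out))
        b = rng.normal(0.0, 0.1, size=fan_out)
        layers.append((W, b))

    def f(x):
        h = x
        for i, (W, b) in enumerate(layers):
            h = h @ W + b
            if i < len(layers) - 1:  # no activation on the output layer
                h = activation(h)
        return h
    return f

def high_freq_fraction(y, cutoff=5):
    # Fraction of non-DC Fourier energy above a low-frequency cutoff: a rough complexity proxy.
    spectrum = np.abs(np.fft.rfft(y - y.mean())) ** 2
    return spectrum[cutoff:].sum() / (spectrum.sum() + 1e-12)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 512).reshape(-1, 1)  # 1-D inputs on a dense grid

activations = [("relu", lambda h: np.maximum(h, 0.0)),
               ("scaled sine", lambda h: np.sin(3.0 * h))]
for name, act in activations:
    scores = [high_freq_fraction(random_mlp([1, 64, 64, 1], act, rng)(x).ravel())
              for _ in range(200)]  # 200 random-weight networks per activation
    print(f"{name}: mean high-frequency energy fraction = {np.mean(scores):.3f}")

Per the paper's findings, one would expect the random ReLU networks to concentrate their energy at low frequencies (a simplicity bias), while sine-like activations shift the sampled functions toward higher complexity, all without any training.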
Pages: 4786-4796
Page count: 11