Simplicity Bias in 1-Hidden Layer Neural Networks

Cited: 0
Authors
Morwani, Depen [1 ]
Batra, Jatin [2 ]
Jain, Prateek [3 ]
Netrapalli, Praneeth [3 ]
Affiliations
[1] Harvard Univ, Dept Comp Sci, Cambridge, MA 02138 USA
[2] Tata Inst Fundamental Res TIFR, Sch Technol & Comp Sci, Hyderabad, India
[3] Google Res, Mountain View, CA USA
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Recent works (Shah et al., 2020; Chen et al., 2021) have demonstrated that neural networks exhibit extreme simplicity bias (SB): they learn only the simplest features to solve the task at hand, even in the presence of other, more robust but more complex features. Because a general and rigorous definition of features is lacking, these works showcase SB on semi-synthetic datasets such as Color-MNIST and MNIST-CIFAR, where defining features is relatively easy. In this work, we rigorously define as well as thoroughly establish SB for one-hidden-layer neural networks. More concretely, (i) we define SB as the network essentially being a function of a low-dimensional projection of the inputs; (ii) theoretically, in the infinite-width regime, we show that when the data is linearly separable, the network primarily depends only on the linearly separable (one-dimensional) subspace, even in the presence of an arbitrarily large number of other, more complex features that could have led to a significantly more robust classifier; (iii) empirically, we show that models trained on real datasets such as ImageNet and Waterbirds-Landbirds indeed depend on a low-dimensional projection of the inputs, thereby demonstrating SB on these datasets; and (iv) finally, we present a natural ensemble approach that encourages diversity by training successive models on features not used by earlier models, and we demonstrate that it yields models that are significantly more robust to Gaussian noise.
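To make points (i) and (iv) concrete, below is a minimal, hypothetical sketch in PyTorch; the synthetic data, helper names, and the subspace heuristic are illustrative assumptions, not the authors' published algorithm. It trains a one-hidden-layer network, estimates the low-dimensional input subspace the network relies on from the top singular directions of its first-layer weights, and then trains a successor model on inputs with that subspace projected out, in the spirit of the diversity-encouraging ensemble described above.

```python
# Sketch only: simplicity bias as f(x) ~= g(P x) for a low-rank projection P,
# and an ensemble that trains the next model on the orthogonal complement of
# the subspace used by the previous one. Assumes PyTorch is installed.
import torch
import torch.nn as nn

def make_mlp(d, width=512):
    # One-hidden-layer ReLU network with a scalar (logit) output.
    return nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))

def train(model, X, y, epochs=200, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X).squeeze(-1), y).backward()
        opt.step()
    return model

def used_subspace(model, k):
    # Heuristic (an assumption of this sketch): the top-k right singular
    # vectors of the first-layer weights approximate the input directions
    # the network actually depends on.
    W = model[0].weight.detach()                  # shape: (width, d)
    _, _, Vt = torch.linalg.svd(W, full_matrices=False)
    return Vt[:k].T                               # (d, k) orthonormal basis

def project_out(X, B):
    # Remove the component of each input lying in span(B).
    return X - X @ B @ B.T

# Synthetic data with a simple 1-d feature and a redundant alternative one.
torch.manual_seed(0)
d, n = 50, 2000
y = (torch.rand(n) > 0.5).float()
s = 2 * y - 1                                     # labels in {-1, +1}
X = torch.randn(n, d)
X[:, 0] = s * (0.5 + torch.rand(n))               # simple, linearly separable feature
X[:, 1] = s * X[:, 1].abs()                       # redundant alternative feature

m1 = train(make_mlp(d), X, y)
B = used_subspace(m1, k=1)                        # SB predicts B ~ span(e_0)
m2 = train(make_mlp(d), project_out(X, B), y)     # successor must use other features
```

Under the abstract's theory, the first model should concentrate on the single linearly separable direction, so projecting that direction out forces the second model onto the remaining features; averaging the two models' logits is one simple way to form the resulting ensemble.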
Pages: 28
Related Papers
(50 records in total)
  • [21] Zhong, Kai; Song, Zhao; Jain, Prateek; Bartlett, Peter L.; Dhillon, Inderjit S. Recovery Guarantees for One-hidden-layer Neural Networks. International Conference on Machine Learning, Vol. 70, 2017.
  • [22] Chui, C. K.; Li, X.; Mhaskar, H. N. Limitations of the approximation capabilities of neural networks with one hidden layer. Advances in Computational Mathematics, 1996, 5(2-3): 233-243.
  • [23] Rivera, Fabian; Hurtado, Remigio. Hidden Layer Visualization for Convolutional Neural Networks: A Brief Review. Proceedings of Ninth International Congress on Information and Communication Technology (ICICT 2024), Vol. 3, 2024, 1013: 471-482.
  • [25] Cai, Guang-Wei; Fang, Zhi; Chen, Yue-Feng. Estimating the number of Hidden Nodes of the Single-hidden-layer Feedforward Neural Networks. 2019 15th International Conference on Computational Intelligence and Security (CIS 2019), 2019: 172-176.
  • [26] Kueken, Anika; Langary, Damoun; Nikoloski, Zoran. The hidden simplicity of metabolic networks is revealed by multireaction dependencies. Science Advances, 2022, 8(13).
  • [27] Gafvert, Oliver; Grindrod, Peter; Harrington, Heather A.; Higham, Catherine F.; Higham, Desmond J.; Yim, Ka Man. On the hidden layer-to-layer topology of the representations of reality realised within neural networks. Engineering Computations, 2025.
  • [28] Krogh, A.; Riis, S. K. Hidden neural networks. Neural Computation, 1999, 11(2): 541-563.
  • [29] Guliyev, Namig J.; Ismailov, Vugar E. On the approximation by single hidden layer feedforward neural networks with fixed weights. Neural Networks, 2018, 98: 296-304.
  • [30] Shibata, K.; Ito, K. Reconstruction of visual sensory space on the hidden layer in layered neural networks. ICONIP'98: The Fifth International Conference on Neural Information Processing, jointly with JNNS'98, 1998: 405-408.