Simplicity Bias in 1-Hidden Layer Neural Networks

Cited by: 0
Authors
Morwani, Depen [1 ]
Batra, Jatin [2 ]
Jain, Prateek [3 ]
Netrapalli, Praneeth [3 ]
Affiliations
[1] Harvard Univ, Dept Comp Sci, Cambridge, MA 02138 USA
[2] Tata Inst Fundamental Res TIFR, Sch Technol & Comp Sci, Hyderabad, India
[3] Google Res, Mountain View, CA USA
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent works (Shah et al., 2020; Chen et al., 2021) have demonstrated that neural networks exhibit extreme simplicity bias (SB). That is, they learn only the simplest features to solve the task at hand, even in the presence of other, more robust but more complex features. Due to the lack of a general and rigorous definition of features, these works showcase SB on semi-synthetic datasets such as Color-MNIST and MNIST-CIFAR, where defining features is relatively easier. In this work, we rigorously define as well as thoroughly establish SB for one-hidden-layer neural networks. More concretely, (i) we define SB as the network essentially being a function of a low-dimensional projection of the inputs; (ii) theoretically, in the infinite-width regime, we show that when the data is linearly separable, the network primarily depends on only the linearly separable (1-dimensional) subspace, even in the presence of an arbitrarily large number of other, more complex features which could have led to a significantly more robust classifier; (iii) empirically, we show that models trained on real datasets such as ImageNet and Waterbirds-Landbirds indeed depend on a low-dimensional projection of the inputs, thereby demonstrating SB on these datasets; (iv) finally, we present a natural ensemble approach that encourages diversity in models by training successive models on features not used by earlier models, and demonstrate that it yields models that are significantly more robust to Gaussian noise.
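As a rough illustration of the ensemble idea described in point (iv), the following minimal PyTorch sketch trains a one-hidden-layer network on synthetic data whose first coordinate is linearly separable, estimates the input subspace the network relies on from the top right-singular direction of its first-layer weights (an illustrative proxy, not the paper's actual procedure), projects that subspace out, trains a second network on the residual directions, and averages the two models' logits under Gaussian input noise. All names, the synthetic data construction, and the subspace estimate are assumptions made for illustration only.

```python
# Hypothetical sketch (not the authors' code): diversity-by-orthogonal-projection ensemble.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: coordinate 0 is the "simple" linearly separable feature;
# coordinates 1-2 carry a weaker, redundant signal.
n, d = 2000, 20
y = torch.randint(0, 2, (n,)).float()
x = torch.randn(n, d)
x[:, 0] = (2 * y - 1) * (1.0 + 0.1 * torch.randn(n))
x[:, 1:3] += 0.5 * (2 * y - 1).unsqueeze(1)

def make_net():
    return nn.Sequential(nn.Linear(d, 100), nn.ReLU(), nn.Linear(100, 1))

def train(net, inputs, targets, steps=500, lr=1e-2):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(inputs).squeeze(1), targets).backward()
        opt.step()
    return net

def accuracy(net, inputs, targets):
    with torch.no_grad():
        return ((net(inputs).squeeze(1) > 0).float() == targets).float().mean().item()

# Model 1: standard training; per the paper's claim, expected to lean on a
# low-dimensional projection of the inputs.
net1 = train(make_net(), x, y)

# Estimate the k-dimensional input subspace net1 uses via the top right-singular
# vectors of its first-layer weight matrix (illustrative proxy only).
k = 1
W1 = net1[0].weight.detach()                 # shape (100, d)
_, _, Vt = torch.linalg.svd(W1, full_matrices=False)
V = Vt[:k].T                                 # (d, k) basis of the used subspace
proj_out = torch.eye(d) - V @ V.T            # projector onto its orthogonal complement

# Model 2: trained only on the remaining directions, encouraging diversity.
net2 = train(make_net(), x @ proj_out, y)

# Evaluate under Gaussian input noise; the ensemble averages the two logits.
x_noisy = x + 0.5 * torch.randn_like(x)
with torch.no_grad():
    ens_logits = 0.5 * (net1(x_noisy) + net2(x_noisy @ proj_out))
    ens_acc = ((ens_logits.squeeze(1) > 0).float() == y).float().mean().item()
print("model 1 (noisy):", accuracy(net1, x_noisy, y))
print("model 2 (noisy):", accuracy(net2, x_noisy @ proj_out, y))
print("ensemble (noisy):", ens_acc)
```

Under these assumptions, model 1 typically leans on the simple coordinate and degrades under noise, while the second model is forced onto the remaining signal, so the averaged ensemble tends to be more robust; the subspace-estimation step is the piece one would replace with the paper's own procedure.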
Pages: 28
Related Papers
50 records in total
  • [1] The Pitfalls of Simplicity Bias in Neural Networks
    Shah, Harshay
    Tamuly, Kaustav
    Raghunathan, Aditi
    Jain, Prateek
    Netrapalli, Praneeth
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [2] Neural network without bias neuron for hidden layer
    Majetic, D.
    Brezak, D.
    Novakovic, B.
    Kasac, J.
    Annals of DAAAM for 2005 & Proceedings of the 16th International DAAAM Symposium: INTELLIGENT MANUFACTURING & AUTOMATION: FOCUS ON YOUNG RESEARCHES AND SCIENTISTS, 2005, : 239 - 240
  • [3] Neural Networks with Marginalized Corrupted Hidden Layer
    Li, Yanjun
    Xin, Xin
    Guo, Ping
    NEURAL INFORMATION PROCESSING, PT III, 2015, 9491 : 506 - 514
  • [4] Neural networks for word recognition: Is a hidden layer necessary?
    Dandurand, Frederic
    Hannagan, Thomas
    Grainger, Jonathan
    COGNITION IN FLUX, 2010, : 688 - 693
  • [5] Regularization of hidden layer unit response for neural networks
    Taga, K
    Kameyama, K
    Toraichi, K
    2003 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS, AND SIGNAL PROCESSING, VOLS 1 AND 2, CONFERENCE PROCEEDINGS, 2003, : 348 - 351
  • [6] HOW TO DETERMINE THE STRUCTURE OF THE HIDDEN LAYER IN NEURAL NETWORKS
    Wei, Qiang
    Zhang, Shijun
    Zhang, Yongchuan
    WATER RESOURCES AND POWER (SHUIDIAN NENGYUAN KEXUE), 1997, (01) : 18 - 22
  • [7] Feedforward Neural Networks with a Hidden Layer Regularization Method
    Alemu, Habtamu Zegeye
    Wu, Wei
    Zhao, Junhong
    SYMMETRY-BASEL, 2018, 10 (10):
  • [8] Modular Expansion of the Hidden Layer in Single Layer Feedforward Neural Networks
    Tissera, Migel D.
    McDonnell, Mark D.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 2939 - 2945
  • [9] Collapsing multiple hidden layers in feedforward neural networks to a single hidden layer
    Blue, JL
    Hall, LO
    APPLICATIONS AND SCIENCE OF ARTIFICIAL NEURAL NETWORKS II, 1996, 2760 : 44 - 52
  • [10] DEGREE OF APPROXIMATION BY NEURAL AND TRANSLATION NETWORKS WITH A SINGLE HIDDEN LAYER
    MHASKAR, HN
    MICCHELLI, CA
    ADVANCES IN APPLIED MATHEMATICS, 1995, 16 (02) : 151 - 183