Simplicity Bias in 1-Hidden Layer Neural Networks

Cited: 0
Authors
Morwani, Depen [1 ]
Batra, Jatin [2 ]
Jain, Prateek [3 ]
Netrapalli, Praneeth [3 ]
Affiliations
[1] Harvard Univ, Dept Comp Sci, Cambridge, MA 02138 USA
[2] Tata Inst Fundamental Res TIFR, Sch Technol & Comp Sci, Hyderabad, India
[3] Google Res, Mountain View, CA USA
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Recent works (Shah et al., 2020; Chen et al., 2021) have demonstrated that neural networks exhibit extreme simplicity bias (SB): they learn only the simplest features to solve the task at hand, even in the presence of other, more robust but more complex features. Because a general and rigorous definition of features is lacking, these works showcase SB on semi-synthetic datasets such as Color-MNIST and MNIST-CIFAR, where defining features is relatively easy. In this work, we rigorously define as well as thoroughly establish SB for one-hidden-layer neural networks. More concretely, (i) we define SB as the network essentially being a function of a low-dimensional projection of the inputs; (ii) theoretically, in the infinite-width regime, we show that when the data is linearly separable, the network primarily depends only on the linearly separable (one-dimensional) subspace, even in the presence of an arbitrarily large number of other, more complex features that could have led to a significantly more robust classifier; (iii) empirically, we show that models trained on real datasets such as ImageNet and Waterbirds-Landbirds indeed depend on a low-dimensional projection of the inputs, thereby demonstrating SB on these datasets; and (iv) finally, we present a natural ensemble approach that encourages diversity by training successive models on features not used by earlier models, and we demonstrate that it yields models that are significantly more robust to Gaussian noise.
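To make points (i) and (iv) concrete, below is a minimal, hypothetical sketch in PyTorch; the synthetic data, helper names, and the subspace heuristic are illustrative assumptions, not the authors' published algorithm. It trains a one-hidden-layer network, estimates the low-dimensional input subspace the network relies on from the top singular directions of its first-layer weights, and then trains a successor model on inputs with that subspace projected out, in the spirit of the diversity-encouraging ensemble described above.

```python
# Sketch only: simplicity bias as f(x) ~= g(P x) for a low-rank projection P,
# and an ensemble that trains the next model on the orthogonal complement of
# the subspace used by the previous one. Assumes PyTorch is installed.
import torch
import torch.nn as nn

def make_mlp(d, width=512):
    # One-hidden-layer ReLU network with a scalar (logit) output.
    return nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))

def train(model, X, y, epochs=200, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X).squeeze(-1), y).backward()
        opt.step()
    return model

def used_subspace(model, k):
    # Heuristic (an assumption of this sketch): the top-k right singular
    # vectors of the first-layer weights approximate the input directions
    # the network actually depends on.
    W = model[0].weight.detach()                  # shape: (width, d)
    _, _, Vt = torch.linalg.svd(W, full_matrices=False)
    return Vt[:k].T                               # (d, k) orthonormal basis

def project_out(X, B):
    # Remove the component of each input lying in span(B).
    return X - X @ B @ B.T

# Synthetic data with a simple 1-d feature and a redundant alternative one.
torch.manual_seed(0)
d, n = 50, 2000
y = (torch.rand(n) > 0.5).float()
s = 2 * y - 1                                     # labels in {-1, +1}
X = torch.randn(n, d)
X[:, 0] = s * (0.5 + torch.rand(n))               # simple, linearly separable feature
X[:, 1] = s * X[:, 1].abs()                       # redundant alternative feature

m1 = train(make_mlp(d), X, y)
B = used_subspace(m1, k=1)                        # SB predicts B ~ span(e_0)
m2 = train(make_mlp(d), project_out(X, B), y)     # successor must use other features
```

Under the abstract's theory, the first model should concentrate on the single linearly separable direction, so projecting that direction out forces the second model onto the remaining features; averaging the two models' logits is one simple way to form the resulting ensemble.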
Pages: 28
Related Papers
(50 records in total)
  • [21] Zhong, Kai; Song, Zhao; Jain, Prateek; Bartlett, Peter L.; Dhillon, Inderjit S. Recovery Guarantees for One-hidden-layer Neural Networks. International Conference on Machine Learning, Vol. 70, 2017.
  • [22] Chui, C. K.; Li, X.; Mhaskar, H. N. Limitations of the approximation capabilities of neural networks with one hidden layer. Advances in Computational Mathematics, 1996, 5(2-3): 233-243.
  • [23] Rivera, Fabian; Hurtado, Remigio. Hidden Layer Visualization for Convolutional Neural Networks: A Brief Review. Proceedings of Ninth International Congress on Information and Communication Technology (ICICT 2024), Vol. 3, 2024, 1013: 471-482.
  • [25] Cai, Guang-Wei; Fang, Zhi; Chen, Yue-Feng. Estimating the number of Hidden Nodes of the Single-hidden-layer Feedforward Neural Networks. 2019 15th International Conference on Computational Intelligence and Security (CIS 2019), 2019: 172-176.
  • [26] Kueken, Anika; Langary, Damoun; Nikoloski, Zoran. The hidden simplicity of metabolic networks is revealed by multireaction dependencies. Science Advances, 2022, 8(13).
  • [27] Gafvert, Oliver; Grindrod, Peter; Harrington, Heather A.; Higham, Catherine F.; Higham, Desmond J.; Yim, Ka Man. On the hidden layer-to-layer topology of the representations of reality realised within neural networks. Engineering Computations, 2025.
  • [28] Krogh, A.; Riis, S. K. Hidden neural networks. Neural Computation, 1999, 11(2): 541-563.
  • [29] Guliyev, Namig J.; Ismailov, Vugar E. On the approximation by single hidden layer feedforward neural networks with fixed weights. Neural Networks, 2018, 98: 296-304.
  • [30] Shibata, K.; Ito, K. Reconstruction of visual sensory space on the hidden layer in layered neural networks. ICONIP'98: The Fifth International Conference on Neural Information Processing, jointly with JNNS'98, 1998: 405-408.