On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

Cited by: 0

Authors:

Lyu, Zhaoyan [1]

Aminian, Gholamali [2]

Rodrigues, Miguel R. D. [1]

Affiliations:

[1] UCL, Dept Elect & Elect Engn, Gower St, London WC1E 6BT, England

[2] British Lib, Alan Turing Inst, 96 Euston Rd, London NW1 2DB, England

Source:

ENTROPY | 2023 / Vol. 25 / Issue 07

Keywords:

deep learning; information theory; information bottleneck; generalization; fitting; compression

DOI:

10.3390/e25071063

Chinese Library Classification (CLC) number:

O4 [Physics]

Discipline classification code:

0702

Abstract:

It is well known that the neural network learning process, along with its connections to fitting, compression, and generalization, is not yet well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, involving the replacement of mutual information measures (which are notoriously difficult to estimate in high-dimensional spaces) by other more tractable ones, including (1) the minimum mean-squared error associated with the reconstruction of the network input data from some intermediate network representation and (2) the cross-entropy associated with a certain class label given some network representation. We then conducted an empirical study to ascertain how different network models, network learning algorithms, and datasets may affect the learning dynamics. Our experiments show that our proposed approach appears to be more reliable than classical information bottleneck approaches in capturing network dynamics during both the training and testing phases. Our experiments also reveal that the fitting and compression phases exist regardless of the choice of activation function. Additionally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases.
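
The two surrogate quantities named in the abstract can be illustrated with a small sketch. This is not the authors' estimator, which the abstract does not specify. It assumes a plain affine least-squares fit as a stand-in for the reconstruction step (this only upper-bounds the true minimum mean-squared error) and a directly supplied table of predicted class probabilities for the cross-entropy. The function names `linear_mse_proxy` and `cross_entropy` and the toy data are hypothetical.

```python
import numpy as np


def linear_mse_proxy(x, t):
    """Mean-squared error of the best affine map from representation t to input x.

    Assumption: an affine estimator stands in for the optimal one, so the
    value is only an upper bound on the true MMSE of x given t.
    """
    t_aug = np.hstack([t, np.ones((t.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(t_aug, x, rcond=None)  # least-squares fit
    residual = x - t_aug @ coef
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def cross_entropy(probs, labels, eps=1e-12):
    """Average negative log-probability (in nats) assigned to the true labels."""
    picked = probs[np.arange(labels.shape[0]), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))


# Hypothetical toy data: 2-D inputs with one lossless and one lossy representation.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
t_lossless = x @ np.array([[2.0, 1.0], [0.0, 3.0]])  # invertible linear map of x
t_lossy = x[:, :1]                                   # keeps only the first coordinate

mse_lossless = linear_mse_proxy(x, t_lossless)  # close to zero: x is recoverable
mse_lossy = linear_mse_proxy(x, t_lossy)        # clearly positive: information lost

# A predictor that is maximally unsure between two classes.
labels = np.array([0, 1, 1])
probs = np.full((3, 2), 0.5)
ce_uniform = cross_entropy(probs, labels)       # equals log(2) nats
```

In this sketch, a representation that discards part of the input shows a larger reconstruction error, which loosely mirrors what the abstract calls compression, while a lower cross-entropy on the labels corresponds to better fitting.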


Pages: 28
