On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

Cited by: 0
Authors
Lyu, Zhaoyan [1]
Aminian, Gholamali [2]
Rodrigues, Miguel R. D. [1]
Affiliations
[1] UCL, Dept Elect & Elect Engn, Gower St, London WC1E 6BT, England
[2] British Lib, Alan Turing Inst, 96 Euston Rd, London NW1 2DB, England
Keywords
deep learning; information theory; information bottleneck; generalization; fitting; compression
DOI
10.3390/e25071063
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
It is well known that the neural network learning process, along with its connections to fitting, compression, and generalization, is not yet well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, in which mutual information measures (which are notoriously difficult to estimate in high-dimensional spaces) are replaced by more tractable ones: (1) the minimum mean-squared error associated with the reconstruction of the network input data from some intermediate network representation and (2) the cross-entropy associated with a certain class label given some network representation. We then conduct an empirical study to ascertain how different network models, network learning algorithms, and datasets may affect the learning dynamics. Our experiments show that the proposed approach appears to be more reliable than classical information bottleneck approaches in capturing network dynamics during both the training and testing phases. Our experiments also reveal that the fitting and compression phases exist regardless of the choice of activation function. Additionally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases.
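The two surrogate measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes an affine least-squares reconstruction as a stand-in for the optimal one (so it only upper-bounds the true minimum mean-squared error), and the function names, the synthetic data, and the toy representations `z_full` and `z_narrow` are all hypothetical.

```python
import numpy as np

def linear_reconstruction_mse(x, z):
    """Mean squared error of the best affine map from z back to x.

    Assumption: an affine estimator stands in for the optimal (generally
    nonlinear) one, so this value only upper-bounds the true MMSE.
    """
    z1 = np.hstack([z, np.ones((z.shape[0], 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(z1, x, rcond=None)     # least-squares fit of x from z
    residual = x - z1 @ w
    return float(np.mean(np.sum(residual ** 2, axis=1)))

def cross_entropy(probs, labels):
    """Average negative log-probability assigned to the true labels."""
    eps = 1e-12  # guards against log(0)
    picked = probs[np.arange(labels.shape[0]), labels]
    return float(-np.mean(np.log(picked + eps)))

# Hypothetical data: a full-width linear representation keeps all input
# information, while a 2-dimensional slice of it discards some.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
z_full = x @ rng.normal(size=(5, 5))
z_narrow = z_full[:, :2]

mse_full = linear_reconstruction_mse(x, z_full)
mse_narrow = linear_reconstruction_mse(x, z_narrow)

# A classifier that outputs uniform probabilities over 3 classes.
uniform_probs = np.full((4, 3), 1.0 / 3.0)
labels = np.array([0, 1, 2, 0])
ce_uniform = cross_entropy(uniform_probs, labels)

print(mse_full, mse_narrow, ce_uniform)
```

In this toy setting, a larger reconstruction error indicates a representation that retains less of the input (more compression), while a lower cross-entropy indicates a better fit to the labels.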
Pages: 28