Information Bottleneck Theory Based Exploration of Cascade Learning

Cited by: 1
Authors
Du, Xin [1 ]
Farrahi, Katayoun [1 ]
Niranjan, Mahesan [1 ]
Affiliations
[1] Univ Southampton, Sch Elect & Comp Sci, Southampton SO17 3AS, Hants, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
information bottleneck theory; Cascade Learning; neural networks;
DOI
10.3390/e23101360
Chinese Library Classification (CLC)
O4 [Physics];
Subject Classification Code
0702;
Abstract
In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation, I(X;T), and the representation to the target, I(T;Y). In this paper, we use an information-theoretic approach to understand how Cascade Learning (CL), a method that trains deep neural networks layer-by-layer and has shown comparable results at lower computation and memory costs, learns representations. We observe that performance is not linked to information compression, which differs from observations of End-to-End (E2E) learning. Additionally, CL can inherit information about targets and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic for setting the depth of a neural network that achieves satisfactory classification accuracy.
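The abstract describes Cascade Learning as training a network layer-by-layer. As a rough illustration (not the authors' code), the sketch below trains each block against a throwaway auxiliary linear head, then freezes the block and feeds its outputs to the next stage; the optimiser, loss, freezing policy, and all names here are assumptions for the sketch, not the paper's exact recipe.

```python
import torch
from torch import nn

def cascade_train(X, y, widths, n_classes, epochs=50, lr=1e-2):
    """Layer-by-layer (cascade-style) training sketch.

    Each new block is optimised jointly with a temporary linear head;
    once trained, the block is frozen and its outputs become the inputs
    to the next stage.
    """
    blocks = []
    feats = X  # inputs to the current stage (features from frozen blocks)
    for width in widths:
        block = nn.Sequential(nn.Linear(feats.shape[1], width), nn.Tanh())
        head = nn.Linear(width, n_classes)  # auxiliary classifier, discarded after training
        opt = torch.optim.Adam([*block.parameters(), *head.parameters()], lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.cross_entropy(head(block(feats)), y)
            loss.backward()
            opt.step()
        for p in block.parameters():  # freeze the trained block
            p.requires_grad_(False)
        with torch.no_grad():         # cache features for the next stage
            feats = block(feats)
        blocks.append(block)
    return blocks

# Toy usage on synthetic data:
X = torch.randn(512, 8)
y = (X[:, 0] > 0).long()
model_blocks = cascade_train(X, y, widths=[16, 16, 16], n_classes=2)
```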
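The proposed information transition ratio, I(T;Y)/I(X;T), can be illustrated with a simple binning (histogram) estimator of mutual information, a common choice in information-plane analyses, though not necessarily the estimator used in the paper. The helper names below (discretize, mutual_information, transition_ratios) are hypothetical, and the per-layer representations are random stand-ins rather than a trained cascade.

```python
import numpy as np

def discretize(acts, n_bins=30):
    """Map continuous activations to integer bin codes (shape preserved)."""
    edges = np.linspace(acts.min(), acts.max(), n_bins + 1)
    return np.digitize(acts, edges[1:-1])

def entropy(symbols):
    """Shannon entropy (nats), treating each row of a 2-D int array as one symbol."""
    _, counts = np.unique(symbols, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for aligned rows of discrete codes."""
    return entropy(a) + entropy(b) - entropy(np.hstack([a, b]))

def transition_ratios(x_codes, layer_codes, y_codes):
    """Per-layer information transition ratio I(T;Y) / I(X;T)."""
    ratios = []
    for t in layer_codes:
        ixt = mutual_information(x_codes, t)
        ity = mutual_information(t, y_codes)
        ratios.append(ity / max(ixt, 1e-12))
    return ratios

# Toy usage: random tanh projections stand in for layer representations T_1..T_4.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = (X[:, :2].sum(axis=1) > 0).astype(int).reshape(-1, 1)  # labels are already discrete
Ts, h = [], X
for _ in range(4):
    h = np.tanh(h @ rng.normal(size=(h.shape[1], 6)))
    Ts.append(h)
ratios = transition_ratios(discretize(X), [discretize(t) for t in Ts], y)
for layer, r in enumerate(ratios, start=1):
    print(f"layer {layer}: I(T;Y)/I(X;T) = {r:.3f}")
```

One plausible reading of the depth heuristic, under these assumptions: stop adding layers once the ratio plateaus, i.e. once an extra layer no longer improves how efficiently the representation carries target information.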
Pages: 16