Splitting of Composite Neural Networks via Proximal Operator With Information Bottleneck

Cited by: 0
Authors
Han, Sang-Il [1 ]
Nakamura, Kensuke [1 ]
Hong, Byung-Woo [1 ]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
Keywords
Linear programming; Deep learning; Task analysis; Mutual information; Training; Optimization methods; Biological neural networks; Information bottleneck; Stochastic gradient descent; Proximal algorithm
DOI
10.1109/ACCESS.2023.3346697
Chinese Library Classification (CLC)
TP [automation technology, computer technology]
Subject classification code
0812
Abstract
Deep learning has achieved remarkable success in the field of machine learning, made possible by the emergence of efficient optimization methods such as Stochastic Gradient Descent (SGD) and its variants. In parallel, the Information Bottleneck (IB) theory has been studied as a way to train neural networks, aiming to enhance the performance of optimization methods. However, previous works have focused on specific tasks, and the effect of IB theory on general deep learning tasks remains unclear. In this study, we introduce a new method inspired by the proximal operator, which sequentially updates the neural network parameters based on bottleneck features defined between the forward and backward networks. Unlike conventional proximal-based methods, we consider the second-order gradients of the objective function to achieve better updates for the forward networks. In contrast to SGD-based methods, our approach opens the network's black box, incorporating the bottleneck-feature update into the parameter update process. In this way, from the perspective of IB theory, the data is well compressed up to the bottleneck feature, while the compressed representation retains sufficient mutual information with the final output. To demonstrate the performance of the proposed approach, we applied the method to various optimizers across several tasks and analyzed the results by training on both the MNIST and CIFAR-10 datasets. We also conducted several ablation studies by modifying the components of the proposed algorithm to further validate its performance.
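To make the splitting scheme concrete, the following is a minimal PyTorch sketch (not the authors' implementation) of the general idea described in the abstract: the network is split at a bottleneck feature, the feature is refined by a proximal-style step that balances the task loss against staying close to the forward network's current output, and the forward and backward sub-networks are then updated in sequence. All names (encoder, head, prox_weight, the inner loop count) are illustrative assumptions, and plain first-order updates stand in for the paper's second-order refinement.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Forward network (input -> bottleneck) and backward network (bottleneck -> output).
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU())
    head = nn.Linear(64, 10)

    opt_enc = torch.optim.SGD(encoder.parameters(), lr=0.1)
    opt_head = torch.optim.SGD(head.parameters(), lr=0.1)
    prox_weight = 1.0  # strength of the proximal coupling term (assumed hyperparameter)

    def split_step(x, y):
        # 1) Treat the bottleneck feature z as an auxiliary variable and refine it:
        #    reduce the task loss while staying close to the encoder's current
        #    output (the proximal term; the "compression" side of the IB view).
        with torch.no_grad():
            z0 = encoder(x)
        z = z0.clone().requires_grad_(True)
        opt_z = torch.optim.SGD([z], lr=0.1)
        for _ in range(5):
            opt_z.zero_grad()
            loss_z = F.cross_entropy(head(z), y) + 0.5 * prox_weight * (z - z0).pow(2).mean()
            loss_z.backward()
            opt_z.step()
        z = z.detach()

        # 2) Update the backward network on the refined bottleneck feature.
        opt_head.zero_grad()
        F.cross_entropy(head(z), y).backward()
        opt_head.step()

        # 3) Update the forward network to reproduce the refined feature,
        #    as in lifted / proximal-operator training schemes.
        opt_enc.zero_grad()
        F.mse_loss(encoder(x), z).backward()
        opt_enc.step()

    # One toy step on random MNIST-shaped data.
    x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
    split_step(x, y)

The inner loop on z is where a second-order method would differ: the paper's approach uses second-order gradients of the objective to obtain a better target feature for the forward network, whereas this sketch uses plain gradient steps.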
Pages: 157-167
Page count: 11