Splitting of Composite Neural Networks via Proximal Operator With Information Bottleneck

Citations: 0
Authors
Han, Sang-Il [1]
Nakamura, Kensuke [1]
Hong, Byung-Woo [1]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
Keywords
Linear programming; Deep learning; Task analysis; Mutual information; Training; Optimization methods; Biological neural networks; Information bottleneck; Stochastic gradient descent; Proximal algorithm
DOI
10.1109/ACCESS.2023.3346697
CLC Number
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
Deep learning has achieved remarkable success in the field of machine learning, made possible by the emergence of efficient optimization methods such as Stochastic Gradient Descent (SGD) and its variants. In parallel, the Information Bottleneck (IB) theory has been studied as a way to train neural networks, aiming to enhance the performance of optimization methods. However, previous works have focused on specific tasks, and the effect of the IB theory on general deep learning tasks remains unclear. In this study, we introduce a new method inspired by the proximal operator, which sequentially updates the neural network parameters based on a bottleneck feature defined between the forward and backward networks. Unlike conventional proximal-based methods, we consider the second-order gradients of the objective function to achieve better updates for the forward networks. In contrast to SGD-based methods, our approach opens the network's black box and incorporates the bottleneck-feature update into the parameter update process. From the perspective of the IB theory, the data is thus well compressed up to the bottleneck feature while the compressed representation maintains sufficient mutual information with the final output. To demonstrate the performance of the proposed approach, we applied the method to various optimizers across several tasks and analyzed the results by training on both the MNIST and CIFAR-10 datasets. We also conducted several ablation studies, modifying the components of the proposed algorithm, to further validate its performance.
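For intuition, the splitting idea described in the abstract can be sketched in code: a forward network f maps the input to a bottleneck feature z, a backward network g maps z to the output, and training alternates a proximal-style refinement of z with updates of g and f. The sketch below is a minimal PyTorch illustration under assumptions of ours, not the authors' implementation: the architectures, the coupling weight rho, and the inner-loop settings are placeholders, and the forward-network update uses a plain first-order step where the paper employs second-order gradients.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical split of a classifier at a bottleneck feature z.
f = nn.Sequential(nn.Linear(784, 64), nn.ReLU())  # forward network: x -> z
g = nn.Linear(64, 10)                             # backward network: z -> y
loss_fn = nn.CrossEntropyLoss()
opt_f = torch.optim.SGD(f.parameters(), lr=1e-2)
opt_g = torch.optim.SGD(g.parameters(), lr=1e-2)
rho = 1.0  # assumed proximal coupling weight

def step(x, y):
    # 1) Proximal-style refinement of the bottleneck feature:
    #    minimize loss(g(z), y) + (rho / 2) * ||z - f(x)||^2 over z.
    z0 = f(x).detach()
    z = z0.clone().requires_grad_(True)
    opt_z = torch.optim.SGD([z], lr=1e-1)
    for _ in range(5):
        opt_z.zero_grad()
        prox_obj = loss_fn(g(z), y) + 0.5 * rho * (z - z0).pow(2).sum()
        prox_obj.backward()
        opt_z.step()
    z = z.detach()

    # 2) Update the backward network on the refined feature.
    opt_g.zero_grad()
    loss_fn(g(z), y).backward()
    opt_g.step()

    # 3) Update the forward network to track the refined feature
    #    (a first-order stand-in for the paper's second-order update).
    opt_f.zero_grad()
    (f(x) - z).pow(2).sum().backward()
    opt_f.step()

x = torch.randn(32, 784)           # toy MNIST-sized batch
y = torch.randint(0, 10, (32,))
step(x, y)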
Pages: 157-167
Page count: 11
Related Papers
50 records in total
  • [41] Flight Target Recognition via Neural Networks and Information Fusion
    Zhang, Yang
    Duan, Zhenzhen
    Zhang, Jian
    Liang, Jing
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, CSPS 2018, VOL II: SIGNAL PROCESSING, 2020, 516: 989-998
  • [42] Learning hidden variable networks: The information bottleneck approach
    Elidan, G
    Friedman, N
    JOURNAL OF MACHINE LEARNING RESEARCH, 2005, 6: 81-127
  • [43] Mitigating Confounding Bias in Recommendation via Information Bottleneck
    Liu, Dugang
    Cheng, Pengxiang
    Zhu, Hong
    Dong, Zhenhua
    He, Xiuqiang
    Pan, Weike
    Ming, Zhong
    15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021), 2021: 351-360
  • [44] A Provably Convergent Information Bottleneck Solution via ADMM
    Huang, Teng-Hui
    El Gamal, Aly
    2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021: 43-48
  • [45] Function Identification in Neuron Populations via Information Bottleneck
    Buddha, S. Kartik
    So, Kelvin
    Carmena, Jose M.
    Gastpar, Michael C.
    ENTROPY, 2013, 15(05): 1587-1608
  • [46] Enhancing Adversarial Transferability via Information Bottleneck Constraints
    Qi, Biqing
    Gao, Junqi
    Liu, Jianxing
    Wu, Ligang
    Zhou, Bowen
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31: 1414-1418
  • [47] Multitask Image Clustering via Deep Information Bottleneck
    Yan, Xiaoqiang
    Mao, Yiqiao
    Li, Mingyuan
    Ye, Yangdong
    Yu, Hui
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54(03): 1868-1881
  • [48] Improving Adversarial Robustness via Information Bottleneck Distillation
    Kuang, Huafeng
    Liu, Hong
    Wu, YongJian
    Satoh, Shin'ichi
    Ji, Rongrong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [49] Correction to: Parseval Proximal Neural Networks
    Hasannasab, Marzieh
    Hertrich, Johannes
    Neumayer, Sebastian
    Plonka, Gerlind
    Setzer, Simon
    Steidl, Gabriele
    Journal of Fourier Analysis and Applications, 2021, 27(3)
  • [50] Operator compression with deep neural networks
    Kröpfl, Fabian
    Maier, Roland
    Peterseim, Daniel
    Advances in Continuous and Discrete Models, 2022