Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm

Cited by: 0
Authors
Jagadeesan, Meena [1 ]
Razenshteyn, Ilya [2 ]
Gunasekar, Suriya [3 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] CipherMode Labs, Los Angeles, CA USA
[3] Microsoft Res, Mountain View, CA USA
Source
Keywords
DOI: Not available
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We provide a function-space characterization of the inductive bias resulting from minimizing the ℓ2 norm of the weights in multi-channel convolutional neural networks with linear activations, and we empirically test the resulting hypothesis on ReLU networks trained using gradient descent. We define an induced regularizer in function space as the minimum ℓ2 norm of the weights of a network required to realize a given function. For two-layer linear convolutional networks with C output channels and kernel size K, we show the following: (a) If the inputs to the network are single-channel, the induced regularizer for any K is independent of the number of output channels C. Furthermore, we derive that the regularizer is a norm given by a semidefinite program (SDP). (b) In contrast, for multi-channel inputs, multiple output channels can be necessary merely to realize all matrix-valued linear functions, and thus the inductive bias does depend on C. However, for sufficiently large C, the induced regularizer is again given by an SDP that is independent of C. In particular, the induced regularizers for K = 1 and K = D (the input dimension) are given in closed form as the nuclear norm and the ℓ2,1 group-sparse norm, respectively, of the Fourier coefficients of the linear predictor. We investigate the broader applicability of our theoretical results to implicit regularization from gradient descent on linear and ReLU networks through experiments on the MNIST and CIFAR-10 datasets.
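The two closed-form characterizations named in the abstract can be read off numerically. The following minimal sketch is not the authors' code; it assumes the matrix-valued linear predictor of a multi-channel-input network is stored as a D x R real matrix beta (D the spatial dimension, R the number of input channels), and that "Fourier coefficients" means the unitary DFT taken along the spatial axis. Both conventions, as well as the helper names, are illustrative assumptions rather than definitions from the paper.

import numpy as np

def fourier_coefficients(beta):
    # Unitary DFT along the spatial axis (axis 0); the 1/sqrt(D) normalization
    # is an assumed convention, not taken from the paper.
    return np.fft.fft(beta, axis=0) / np.sqrt(beta.shape[0])

def induced_regularizer_k1(beta):
    # Closed form stated for kernel size K = 1: nuclear norm (sum of singular
    # values) of the D x R matrix of Fourier coefficients.
    return np.linalg.norm(fourier_coefficients(beta), ord='nuc')

def induced_regularizer_kD(beta):
    # Closed form stated for K = D: the l_{2,1} group-sparse norm, i.e. the sum
    # over frequencies of the l_2 norm across input channels.
    return np.sum(np.linalg.norm(fourier_coefficients(beta), axis=1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, R = 8, 3                              # spatial dimension, input channels
    beta = rng.standard_normal((D, R))       # a matrix-valued linear predictor
    print("K = 1 regularizer (nuclear norm of DFT):", induced_regularizer_k1(beta))
    print("K = D regularizer (l_{2,1} norm of DFT):", induced_regularizer_kD(beta))

With this representation, a predictor whose DFT is supported on few frequencies has small ℓ2,1 value, while one whose DFT matrix is low rank has small nuclear norm, matching the sparsity-versus-low-rank contrast between the K = D and K = 1 regimes described above.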
Pages: 50
Related Papers (50 in total)
  • [1] Multi-Channel Differential Synchronous Demodulator for Linear Inductive Position Sensor
    Xue, Jun
    Wang, Dezhi
    RADIOENGINEERING, 2024, 33 (04) : 519 - 525
  • [2] Multi-channel convolutional neural networks for materials properties prediction
    Zheng, Xiaolong
    Zheng, Peng
    Zheng, Liang
    Zhang, Yang
    Zhang, Rui-Zhi
    COMPUTATIONAL MATERIALS SCIENCE, 2020, 173
  • [3] Multi-Channel Recurrent Convolutional Neural Networks for Energy Disaggregation
    Kaselimi, Maria
    Protopapadakis, Eftychios
    Voulodimos, Athanasios
    Doulamis, Nikolaos
    Doulamis, Anastasios
    IEEE ACCESS, 2019, 7 : 81047 - 81056
  • [4] Quantum convolutional neural networks for multi-channel supervised learning
    Smaldone, Anthony M.
    Kyro, Gregory W.
    Batista, Victor S.
    QUANTUM MACHINE INTELLIGENCE, 2023, 5 (02)
  • [5] Efficient transfer learning for multi-channel convolutional neural networks
    de La Comble, Alois
    Prepin, Ken
    PROCEEDINGS OF 17TH INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA 2021), 2021,
  • [6] Multi-channel coronal hole detection with convolutional neural networks
    Jarolim, R.
    Veronig, A. M.
    Hofmeister, S.
    Heinemann, S. G.
    Temmer, M.
    Podladchikova, T.
    Dissauer, K.
    ASTRONOMY & ASTROPHYSICS, 2021, 652
  • [7] Multi-Channel Fetal ECG Denoising With Deep Convolutional Neural Networks
    Fotiadou, Eleni
    Vullings, Rik
    FRONTIERS IN PEDIATRICS, 2020, 8
  • [8] Multi-Channel Convolutional Neural Networks for Image Super-Resolution
    Ohtani, Shinya
    Kato, Yu
    Kuroki, Nobutaka
    Hirose, Tetsuya
    Numa, Masahiro
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (02) : 572 - 580
  • [9] Image Super-Resolution with Multi-Channel Convolutional Neural Networks
    Kato, Yu
    Ohtani, Shinya
    Kuroki, Nobutaka
    Hirose, Tetsuya
    Numa, Masahiro
    2016 14TH IEEE INTERNATIONAL NEW CIRCUITS AND SYSTEMS CONFERENCE (NEWCAS), 2016,