Isomorphic model-based initialization for convolutional neural networks

Cited by: 2
Authors
Zhang, Hong [1 ]
Li, Yang [1 ]
Yang, Hanqing [1 ]
He, Bin [1 ]
Zhang, Yu [1 ]
Affiliations
[1] Zhejiang Univ, Coll Control Sci & Engn, State Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
Keywords
Convolutional neural networks; Weight initialization; Isomorphic model; Structural weight transformation
DOI
10.1016/j.jvcir.2022.103677
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
Modern deep convolutional neural networks (CNNs) are often designed to be scalable, which leads to the concept of a model family: a large (possibly infinite) collection of related neural network architectures. The isomorphism of a model family refers to the fact that the models within it share the same high-level structure, and such models are called isomorphic models of each other. Existing weight initialization methods for CNNs rely on random or data-driven initialization. Although these methods can perform satisfactorily, the isomorphism of model families is rarely explored. This work proposes an isomorphic model-based initialization method (IM Init) for CNNs, which can initialize any network with another well-trained isomorphic model from the same model family. We first formulate the widely used general network structure of CNNs. We then present a structural weight transformation that transforms weights between two isomorphic models. Finally, we apply IM Init to the model down-sampling and up-sampling scenarios and confirm its effectiveness in improving accuracy and convergence speed through experiments on several image classification datasets. In the model down-sampling scenario, IM Init initializes a smaller target model with a larger well-trained source model; it improves the accuracy of RegNet200MF by 1.59% on the CIFAR-100 dataset and 1.9% on the CUB200 dataset. Conversely, in the model up-sampling scenario, IM Init initializes a larger target model with a smaller well-trained source model; it significantly speeds up the convergence of RegNet600MF and improves accuracy by 30.10% under short training schedules. Code will be available.
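The paper's exact structural weight transformation is not reproduced in this record. As a rough illustration only, the sketch below shows the general idea of initializing a target convolution layer from a well-trained isomorphic source layer of a different width, assuming the usual (out_channels, in_channels, kH, kW) weight layout. The function name `transfer_conv_weight` and the He-style random fill for extra channels in the up-sampling case are this sketch's assumptions, not the authors' method.

```python
import numpy as np

def transfer_conv_weight(w_src, out_c, in_c):
    """Build a target conv weight of shape (out_c, in_c, kH, kW) from a
    well-trained isomorphic source weight w_src of shape (O, I, kH, kW).

    Down-sampling (target narrower): keep only the leading channels.
    Up-sampling (target wider): copy every source channel, then fill the
    remaining channels with small He-style random values (an assumption
    of this sketch, not necessarily what IM Init does).
    """
    O, I, kh, kw = w_src.shape
    rng = np.random.default_rng(0)
    fan_in = in_c * kh * kw
    # Start from a standard random initialization...
    w_tgt = rng.normal(0.0, np.sqrt(2.0 / fan_in), (out_c, in_c, kh, kw))
    # ...then overwrite the overlapping channel block with trained weights.
    co, ci = min(out_c, O), min(in_c, I)
    w_tgt[:co, :ci] = w_src[:co, :ci]
    return w_tgt
```

Applied layer by layer across two models sharing the same high-level structure, this kind of transformation covers both scenarios in the abstract: slicing a larger trained model into a smaller one, or embedding a smaller trained model inside a larger one.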
Pages: 9
Related papers
50 records
  • [1] Initialization of Convolutional Neural Networks by Gabor Filters
    Ozbulak, Gokhan
    Ekenel, Hazim Kemal
    [J]. 2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [2] MODEL-BASED NEURAL NETWORKS
    CAELLI, TM
    SQUIRE, DM
    WILD, TPJ
    [J]. NEURAL NETWORKS, 1993, 6 (05) : 613 - 625
  • [3] Model-based neural networks
    Fontaine, JL
    Germain, A
    [J]. COMPUTERS & CHEMICAL ENGINEERING, 2001, 25 (7-8) : 1045 - 1054
  • [4] On the use of Convolutional Neural Networks for Graphical Model-based Human Pose Estimation
    Huynh Vu
    Cheng, Eva
    Wilkinson, Richardt
    Lech, Margaret
    [J]. 2017 INTERNATIONAL CONFERENCE ON RECENT ADVANCES IN SIGNAL PROCESSING, TELECOMMUNICATIONS & COMPUTING (SIGTELCOM), 2017, : 88 - 93
  • [5] Autoregressive Model-Based Structural Damage Identification and Localization Using Convolutional Neural Networks
    Tang, Qizhi
    Zhou, Jianting
    Xin, Jingzhou
    Zhao, Siyu
    Zhou, Yingxin
    [J]. KSCE JOURNAL OF CIVIL ENGINEERING, 2020, 24 (07) : 2173 - 2185
  • [6] Neural networks for model-based control
    Tronci, S
    Servida, A
    Baratti, R
    [J]. INTELLIGENT CONTROL SYSTEMS AND SIGNAL PROCESSING 2003, 2003, : 177 - 182
  • [7] Neural networks for model-based prognostics
    Jaw, Link C.
    [J]. IEEE Aerospace Applications Conference Proceedings, 1999, 3 : 21 - 28
  • [8] Unlabeled PCA-shuffling initialization for convolutional neural networks
    Ou, Jun
    Li, Yujian
    Shen, Chengkai
    [J]. APPLIED INTELLIGENCE, 2018, 48 (12) : 4565 - 4576
  • [9] Power-law initialization algorithm for convolutional neural networks
    Jiang, Kaiwen
    Liu, Jian
    Xing, Tongtong
    Li, Shujing
    Wu, Shunyao
    Shao, Fengjing
    Sun, Rencheng
    [J]. NEURAL COMPUTING AND APPLICATIONS, 2023, 35 : 22431 - 22447