The reusability prior: comparing deep learning models without training

Cited by: 1
Authors
Polat, Aydin Goze [1 ]
Alpaslan, Ferda Nur [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Source
MACHINE LEARNING: SCIENCE AND TECHNOLOGY
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts in which model components function during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter. This allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks. We give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the experimental results, including in at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
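
To make the counting idea concrete, the following is a minimal, hypothetical Python sketch; it does not reproduce the paper's actual graph construction. The names GRAPH, PARAM_USES, and paths_from_input are invented for illustration, and the number of distinct input-to-usage-site paths is assumed here as a crude proxy for a parameter's context count. The point it illustrates: with cross-layer parameter sharing, a tied weight accumulates contexts from every site that reuses it.

from functools import lru_cache

# Toy computation DAG: node -> list of predecessor nodes. block_a and
# block_b below consume the same weight tensor W_shared at two depths
# (cross-layer parameter sharing).
GRAPH = {
    "input":   [],
    "conv1":   ["input"],
    "block_a": ["conv1"],
    "block_b": ["block_a"],
    "head":    ["block_b"],
}

# Which graph nodes consume which learnable parameter (hypothetical map).
PARAM_USES = {
    "W_conv1":  ["conv1"],
    "W_shared": ["block_a", "block_b"],  # one tensor tied across two layers
    "W_head":   ["head"],
}

@lru_cache(maxsize=None)
def paths_from_input(node: str) -> int:
    """Count distinct paths from 'input' to `node` in the DAG."""
    preds = GRAPH[node]
    if not preds:
        return 1  # reached the input node
    return sum(paths_from_input(p) for p in preds)

def context_counts() -> dict:
    """Assumed proxy: sum path counts over all usage sites of a parameter."""
    return {param: sum(paths_from_input(node) for node in uses)
            for param, uses in PARAM_USES.items()}

if __name__ == "__main__":
    for param, count in context_counts().items():
        print(f"{param}: {count} context(s)")

Running this prints 2 contexts for W_shared and 1 for each unshared weight, mirroring the intuition that a tied parameter must function in more diverse contexts while adding no extra parameters.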
Pages: 17