The reusability prior: comparing deep learning models without training

Cited by: 1
|
Authors
Polat, Aydin Goze [1 ]
Alpaslan, Ferda Nur [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Source
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts for model components during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter. This allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks. We give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the results, including in at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
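
As an informal illustration of the counting idea, the Python sketch below approximates a parameter's number of contexts by the number of distinct input-to-operation paths, summed over all operations that use the parameter, in a toy computation DAG. This is not the paper's exact graph methodology; the model, operation, and parameter names and the path-count proxy are assumptions made only for illustration.

# Toy illustration only: NOT the paper's exact method. It sketches the idea of
# counting "contexts" per learnable parameter on a tiny computation DAG.
# All op and parameter names below are hypothetical.

def count_paths(edges, node, cache=None):
    # Number of distinct paths from the data source "input" to `node`.
    if cache is None:
        cache = {}
    if node == "input":
        return 1
    if node not in cache:
        cache[node] = sum(count_paths(edges, up, cache) for up in edges[node])
    return cache[node]

def context_counts(edges, params):
    # Crude proxy: for each parameter, sum the input->op path counts over the
    # operations the parameter is attached to.
    return {p: sum(count_paths(edges, op) for op in ops) for p, ops in params.items()}

# DAG as op -> upstream ops: a plain three-layer chain.
edges = {"conv1": ["input"], "conv2": ["conv1"], "conv3": ["conv2"]}

# Model A: one weight per layer. Model B: one weight shared across all layers.
params_a = {"w1": ["conv1"], "w2": ["conv2"], "w3": ["conv3"]}
params_b = {"w_shared": ["conv1", "conv2", "conv3"]}

print(context_counts(edges, params_a))  # {'w1': 1, 'w2': 1, 'w3': 1}
print(context_counts(edges, params_b))  # {'w_shared': 3}

Under this proxy the shared weight sees three contexts while each unshared weight sees one, mirroring the abstract's intuition that cross-layer parameter sharing multiplies the contexts in which a parameter must function.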
Pages: 17
Related papers
50 in total
  • [1] Training deep learning based image denoisers from undersampled measurements without ground truth and without image prior
    Zhussip, Magauiya
    Soltanayev, Shakarim
    Chun, Se Young
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10247 - 10256
  • [2] Exploring new strategies for comparing deep learning models
    Butler, Samantha J.
    Price, Stanton R.
    Hadia, Xian Mae D.
    Price, Steven R.
    Carley, Samantha C.
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS V, 2023, 12538
  • [3] A Deep Learning Method for Comparing Bayesian Hierarchical Models
    Elsemueller, Lasse
    Schnuerch, Martin
    Buerkner, Paul-Christian
    Radev, Stefan T.
    PSYCHOLOGICAL METHODS, 2024,
  • [4] Continuous Training and Deployment of Deep Learning Models
    Prapas, Ioannis
    Derakhshan, Behrouz
    Mahdiraji, Alireza Rezaei
    Markl, Volker
    Datenbank-Spektrum, 2021, 21 (03) : 203 - 212
  • [5] Towards Training Reproducible Deep Learning Models
    Chen, Boyuan
    Wen, Mingzhi
    Shi, Yong
    Lin, Dayi
    Rajbahadur, Gopi Krishnan
    Jiang, Zhen Ming
    2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, : 2202 - 2214
  • [6] Tensor Normal Training for Deep Learning Models
    Ren, Yi
    Goldfarb, Donald
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [7] Within Reach? Learning to touch objects without prior models
    de La Bourdonnaye, Francois
    Teuliere, Celine
    Chateau, Thierry
    Triesch, Jochen
    2019 JOINT IEEE 9TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING AND EPIGENETIC ROBOTICS (ICDL-EPIROB), 2019, : 93 - 98
  • [8] Deep Learning or Deep Ignorance? Comparing Untrained Recurrent Models in Educational Contexts
    Botelho, Anthony F.
    Prihar, Ethan
    Heffernan, Neil T.
    ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, 2022, 13355 : 281 - 293
  • [9] Comparing Deep Learning Models for Image Classification in Urban Flooding
    Goncalves, Andre
    Resende, Luis
    Conci, Aura
    2024 31ST INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING, IWSSIP 2024, 2024,
  • [10] Evolution and Role of Optimizers in Training Deep Learning Models
    Wen, XiaoHao
    Zhou, MengChu
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (10) : 2039 - 2042