The reusability prior: comparing deep learning models without training

Times Cited: 1
Authors
Polat, Aydin Goze [1]
Alpaslan, Ferda Nur [1]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts in which model components operate during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter, which allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks, and give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the experimental results, including in at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
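To make the central idea concrete, the minimal sketch below (Python) counts "contexts" for a toy model. It is an illustrative assumption, not the paper's actual graph methodology: the model is represented as a DAG of layer applications, each node is mapped to the parameter block it uses, and a block's context count is taken to be the number of distinct input-to-output paths passing through its nodes. Cross-layer parameter sharing then visibly raises the count for the shared block; all names (context_counts, param_of, etc.) are hypothetical.

from collections import defaultdict, deque

def topological_order(nodes, edges):
    # Kahn's algorithm over a DAG given as node -> list of successor nodes.
    indegree = {n: 0 for n in nodes}
    for n in nodes:
        for m in edges.get(n, []):
            indegree[m] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in edges.get(n, []):
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    return order

def context_counts(nodes, edges, sources, sinks, param_of):
    # A block's count = sum over its nodes of
    # (#paths from any source to the node) * (#paths from the node to any sink).
    order = topological_order(nodes, edges)
    paths_in = defaultdict(int)
    for s in sources:
        paths_in[s] = 1
    for n in order:                      # forward pass: paths reaching each node
        for m in edges.get(n, []):
            paths_in[m] += paths_in[n]
    paths_out = defaultdict(int)
    for t in sinks:
        paths_out[t] = 1
    for n in reversed(order):            # backward pass: paths leaving each node
        for m in edges.get(n, []):
            paths_out[n] += paths_out[m]
    counts = defaultdict(int)
    for n in nodes:
        counts[param_of[n]] += paths_in[n] * paths_out[n]
    return dict(counts)

# Toy comparison: a 3-layer chain with unique parameter blocks versus the same
# chain reusing one block in every layer (cross-layer parameter sharing).
nodes = ["L1", "L2", "L3"]
edges = {"L1": ["L2"], "L2": ["L3"]}
unique = context_counts(nodes, edges, ["L1"], ["L3"],
                        {"L1": "W1", "L2": "W2", "L3": "W3"})
shared = context_counts(nodes, edges, ["L1"], ["L3"],
                        {"L1": "W", "L2": "W", "L3": "W"})
print(unique)  # {'W1': 1, 'W2': 1, 'W3': 1}: each block appears in one context
print(shared)  # {'W': 3}: the shared block appears in three contexts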
Pages: 17
Related Papers
50 records in total (items [21]–[30] shown)
  • [21] Identifying Bikers Without Helmets Using Deep Learning Models
    Hossain, Md Iqbal
    Muhib, Raghib Barkat
    Chakrabarty, Amitabha
    2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021, : 510 - 517
  • [22] Training Size Effect on Deep Learning Models for Geographic Atrophy
    Slater, Robert
    Banghart, Mark
    Channa, Roomasa
    Blodi, Barbara A.
    Fong, Donald S.
    Domalpally, Amitha
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2023, 64 (08)
  • [23] Fast Training of Deep Learning Models over Multiple GPUs
    Yi, Xiaodong
    Luo, Ziyue
    Meng, Chen
    Wang, Mengdi
    Long, Guoping
    Wu, Chuan
    Yang, Jun
    Lin, Wei
    PROCEEDINGS OF THE 2020 21ST INTERNATIONAL MIDDLEWARE CONFERENCE (MIDDLEWARE '20), 2020, : 105 - 118
  • [24] Fast Training Methods and Their Experiments for Deep Learning CNN Models
    Jiang, Shanshan
    Wang, Sheng-Guo
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 8253 - 8260
  • [25] Reproducibility of Training Deep Learning Models for Medical Image Analysis
    Bosma, Joeran Sander
Peeters, Dré
Alves, Natália
    Saha, Anindo
    Saghir, Zaigham
    Jacobs, Colin
    Huisman, Henkjan
    MEDICAL IMAGING WITH DEEP LEARNING, VOL 227, 2023, 227 : 1269 - 1287
  • [27] Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
    Acun, Bilge
    Murphy, Matthew
    Wang, Xiaodong
    Nie, Jade
    Wu, Carole-Jean
    Hazelwood, Kim
    2021 27TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2021), 2021, : 802 - 814
  • [28] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [29] Post-training Quantization Methods for Deep Learning Models
    Kluska, Piotr
    Zieba, Maciej
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 467 - 479
  • [30] Making It Simple? Training Deep Learning Models Toward Simplicity
    Repetto, Marco
    La Torre, Davide
    2022 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATIONS (DASA), 2022, : 784 - 789