The reusability prior: comparing deep learning models without training

Cited by: 1
Authors
Polat, Aydin Goze [1 ]
Alpaslan, Ferda Nur [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts in which model components operate during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter, which allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks, and give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis quantities we introduce for the reusability prior align well with the experimental results, including in at least two important edge cases. We conclude that the reusability prior offers a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
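
The abstract's central quantity is the number of contexts each learnable parameter is exposed to in the model graph. Below is a minimal, hypothetical Python sketch of that counting idea; the toy layer graph, the simplified definition of a "context" as a distinct usage site, and all names (count_contexts, W_shared, W_head) are illustrative assumptions, not the paper's actual graph-analysis procedure.

# A minimal, hypothetical sketch (not the paper's actual procedure): for each
# named parameter group in a toy layer graph with cross-layer parameter
# sharing, count how many distinct usage sites ("contexts") it appears in.
from collections import defaultdict

def count_contexts(layer_graph, param_of):
    """layer_graph maps each layer to the layers it feeds into; param_of maps
    each layer to the parameter group it uses. A 'context' is approximated
    here as either a direct use of the parameter in a layer or an application
    of the parameter to the output of a distinct predecessor layer."""
    contexts = defaultdict(set)
    for layer, successors in layer_graph.items():
        contexts[param_of[layer]].add(("use", layer))
        for nxt in successors:
            # a shared parameter gains a new context each time it is applied
            # to the output of a different predecessor layer
            contexts[param_of[nxt]].add(("input_from", layer))
    return {param: len(sites) for param, sites in contexts.items()}

if __name__ == "__main__":
    # Toy network: block W_shared is reused by three consecutive layers
    # (cross-layer parameter sharing); W_head is used once.
    layer_graph = {"l1": ["l2"], "l2": ["l3"], "l3": ["head"], "head": []}
    param_of = {"l1": "W_shared", "l2": "W_shared",
                "l3": "W_shared", "head": "W_head"}
    print(count_contexts(layer_graph, param_of))
    # {'W_shared': 5, 'W_head': 2} -- the shared block accumulates more
    # contexts than the unshared head, the kind of comparison (made without
    # any training) that the reusability prior formalizes.
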
Pages: 17
Related Papers
50 records in total
  • [41] Distributed Framework for Accelerating Training of Deep Learning Models through Prioritization
    Zhou, Tian
    Gao, Lixin
    2021 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING, IC2E 2021, 2021, : 201 - 209
  • [42] Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models
    Teng, Yunfei
    Gao, Wenbo
    Chalus, Francois
    Choromanska, Anna
    Goldfarb, Donald
    Weller, Adrian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [43] Phase-Change Memory Models for Deep Learning Training and Inference
    Nandakumar, S. R.
    Boybat, Irem
    Joshi, Vinay
    Piveteau, Christophe
    Le Gallo, Manuel
    Rajendran, Bipin
    Sebastian, Abu
    Eleftheriou, Evangelos
    2019 26TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS), 2019, : 727 - 730
  • [44] Exploration of the Influence on Training Deep Learning Models by Watermarked Image Dataset
    Liu, Shiqin
    Feng, Shiyuan
    Wu, Jinxia
    Ren, Wei
    Wang, Weiqi
    Zheng, Wenwen
    19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021, : 421 - 428
  • [45] Standardizing and Centralizing Datasets for Efficient Training of Agricultural Deep Learning Models
    Joshi, Amogh
    Guevara, Dario
    Earles, Mason
    PLANT PHENOMICS, 2023, 5
  • [46] Automated code transformation for distributed training of TensorFlow deep learning models
    Sim, Yusung
    Shin, Wonho
    Lee, Sungho
    SCIENCE OF COMPUTER PROGRAMMING, 2025, 242
  • [47] Efficient Training of Deep Learning Models Through Improved Adaptive Sampling
    Avalos-Lopez, Jorge Ivan
    Rojas-Dominguez, Alfonso
    Ornelas-Rodriguez, Manuel
    Carpio, Martin
    Valdez, S. Ivvan
    PATTERN RECOGNITION (MCPR 2021), 2021, 12725 : 141 - 152
  • [48] Training confounder-free deep learning models for medical applications
    Zhao, Qingyu
    Adeli, Ehsan
    Pohl, Kilian M.
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [50] Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models
    Ghadirzadeh, Ali
    Poklukar, Petra
    Arndt, Karol
    Finn, Chelsea
    Kyrki, Ville
    Kragic, Danica
    Björkman, Mårten
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23