The reusability prior: comparing deep learning models without training

Cited by: 1
Authors
Polat, Aydin Goze [1 ]
Alpaslan, Ferda Nur [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts in which model components operate during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter, which allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks, and give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis quantities we introduce for the reusability prior align well with the experimental results, including in at least two important edge cases. We conclude that the reusability prior offers a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
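
The abstract's central quantity is the number of contexts each learnable parameter is exposed to in the model graph. Below is a minimal, hypothetical Python sketch of that counting idea; the toy layer graph, the simplified definition of a "context" as a distinct usage site, and all names (count_contexts, W_shared, W_head) are illustrative assumptions, not the paper's actual graph-analysis procedure.

# A minimal, hypothetical sketch (not the paper's actual procedure): for each
# named parameter group in a toy layer graph with cross-layer parameter
# sharing, count how many distinct usage sites ("contexts") it appears in.
from collections import defaultdict

def count_contexts(layer_graph, param_of):
    """layer_graph maps each layer to the layers it feeds into; param_of maps
    each layer to the parameter group it uses. A 'context' is approximated
    here as either a direct use of the parameter in a layer or an application
    of the parameter to the output of a distinct predecessor layer."""
    contexts = defaultdict(set)
    for layer, successors in layer_graph.items():
        contexts[param_of[layer]].add(("use", layer))
        for nxt in successors:
            # a shared parameter gains a new context each time it is applied
            # to the output of a different predecessor layer
            contexts[param_of[nxt]].add(("input_from", layer))
    return {param: len(sites) for param, sites in contexts.items()}

if __name__ == "__main__":
    # Toy network: block W_shared is reused by three consecutive layers
    # (cross-layer parameter sharing); W_head is used once.
    layer_graph = {"l1": ["l2"], "l2": ["l3"], "l3": ["head"], "head": []}
    param_of = {"l1": "W_shared", "l2": "W_shared",
                "l3": "W_shared", "head": "W_head"}
    print(count_contexts(layer_graph, param_of))
    # {'W_shared': 5, 'W_head': 2} -- the shared block accumulates more
    # contexts than the unshared head, the kind of comparison (made without
    # any training) that the reusability prior formalizes.
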
Pages: 17
Related Papers
50 records in total
  • [41] Distributed Framework for Accelerating Training of Deep Learning Models through Prioritization
    Zhou, Tian
    Gao, Lixin
    2021 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING, IC2E 2021, 2021, : 201 - 209
  • [42] Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models
    Teng, Yunfei
    Gao, Wenbo
    Chalus, Francois
    Choromanska, Anna
    Goldfarb, Donald
    Weller, Adrian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [43] Phase-Change Memory Models for Deep Learning Training and Inference
    Nandakumar, S. R.
    Boybat, Irem
    Joshi, Vinay
    Piveteau, Christophe
    Le Gallo, Manuel
    Rajendran, Bipin
    Sebastian, Abu
    Eleftheriou, Evangelos
    2019 26TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS), 2019, : 727 - 730
  • [44] Exploration of the Influence on Training Deep Learning Models by Watermarked Image Dataset
    Liu, Shiqin
    Feng, Shiyuan
    Wu, Jinxia
    Ren, Wei
    Wang, Weiqi
    Zheng, Wenwen
    19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021, : 421 - 428
  • [45] Standardizing and Centralizing Datasets for Efficient Training of Agricultural Deep Learning Models
    Joshi, Amogh
    Guevara, Dario
    Earles, Mason
    PLANT PHENOMICS, 2023, 5
  • [46] Automated code transformation for distributed training of TensorFlow deep learning models
    Sim, Yusung
    Shin, Wonho
    Lee, Sungho
    SCIENCE OF COMPUTER PROGRAMMING, 2025, 242
  • [47] Efficient Training of Deep Learning Models Through Improved Adaptive Sampling
    Avalos-Lopez, Jorge Ivan
    Rojas-Dominguez, Alfonso
    Ornelas-Rodriguez, Manuel
    Carpio, Martin
    Valdez, S. Ivvan
    PATTERN RECOGNITION (MCPR 2021), 2021, 12725 : 141 - 152
  • [48] Training confounder-free deep learning models for medical applications
    Zhao, Qingyu
    Adeli, Ehsan
    Pohl, Kilian M.
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [50] Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models
    Ghadirzadeh, Ali
    Poklukar, Petra
    Arndt, Karol
    Finn, Chelsea
    Kyrki, Ville
    Kragic, Danica
    Björkman, Mårten
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23