The reusability prior: comparing deep learning models without training

Times Cited: 1
Authors
Polat, Aydin Goze [1]
Alpaslan, Ferda Nur [1]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts in which model components operate during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter, which allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks, and give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the experimental results, including in at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
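To make the central idea concrete, the minimal sketch below (Python) counts "contexts" for a toy model. It is an illustrative assumption, not the paper's actual graph methodology: the model is represented as a DAG of layer applications, each node is mapped to the parameter block it uses, and a block's context count is taken to be the number of distinct input-to-output paths passing through its nodes. Cross-layer parameter sharing then visibly raises the count for the shared block; all names (context_counts, param_of, etc.) are hypothetical.

from collections import defaultdict, deque

def topological_order(nodes, edges):
    # Kahn's algorithm over a DAG given as node -> list of successor nodes.
    indegree = {n: 0 for n in nodes}
    for n in nodes:
        for m in edges.get(n, []):
            indegree[m] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in edges.get(n, []):
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    return order

def context_counts(nodes, edges, sources, sinks, param_of):
    # A block's count = sum over its nodes of
    # (#paths from any source to the node) * (#paths from the node to any sink).
    order = topological_order(nodes, edges)
    paths_in = defaultdict(int)
    for s in sources:
        paths_in[s] = 1
    for n in order:                      # forward pass: paths reaching each node
        for m in edges.get(n, []):
            paths_in[m] += paths_in[n]
    paths_out = defaultdict(int)
    for t in sinks:
        paths_out[t] = 1
    for n in reversed(order):            # backward pass: paths leaving each node
        for m in edges.get(n, []):
            paths_out[n] += paths_out[m]
    counts = defaultdict(int)
    for n in nodes:
        counts[param_of[n]] += paths_in[n] * paths_out[n]
    return dict(counts)

# Toy comparison: a 3-layer chain with unique parameter blocks versus the same
# chain reusing one block in every layer (cross-layer parameter sharing).
nodes = ["L1", "L2", "L3"]
edges = {"L1": ["L2"], "L2": ["L3"]}
unique = context_counts(nodes, edges, ["L1"], ["L3"],
                        {"L1": "W1", "L2": "W2", "L3": "W3"})
shared = context_counts(nodes, edges, ["L1"], ["L3"],
                        {"L1": "W", "L2": "W", "L3": "W"})
print(unique)  # {'W1': 1, 'W2': 1, 'W3': 1}: each block appears in one context
print(shared)  # {'W': 3}: the shared block appears in three contexts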
Pages: 17
Related Papers
50 records in total (items [21]–[30] shown)
  • [21] Identifying Bikers Without Helmets Using Deep Learning Models
    Hossain, Md Iqbal
    Muhib, Raghib Barkat
    Chakrabarty, Amitabha
    2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021, : 510 - 517
  • [22] Training Size Effect on Deep Learning Models for Geographic Atrophy
    Slater, Robert
    Banghart, Mark
    Channa, Roomasa
    Blodi, Barbara A.
    Fong, Donald S.
    Domalpally, Amitha
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2023, 64 (08)
  • [23] Fast Training of Deep Learning Models over Multiple GPUs
    Yi, Xiaodong
    Luo, Ziyue
    Meng, Chen
    Wang, Mengdi
    Long, Guoping
    Wu, Chuan
    Yang, Jun
    Lin, Wei
    PROCEEDINGS OF THE 2020 21ST INTERNATIONAL MIDDLEWARE CONFERENCE (MIDDLEWARE '20), 2020, : 105 - 118
  • [24] Fast Training Methods and Their Experiments for Deep Learning CNN Models
    Jiang, Shanshan
    Wang, Sheng-Guo
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 8253 - 8260
  • [25] Reproducibility of Training Deep Learning Models for Medical Image Analysis
    Bosma, Joeran Sander
Peeters, Dré
Alves, Natália
    Saha, Anindo
    Saghir, Zaigham
    Jacobs, Colin
    Huisman, Henkjan
    MEDICAL IMAGING WITH DEEP LEARNING, VOL 227, 2023, 227 : 1269 - 1287
  • [27] Understanding Training Efficiency of Deep Learning Recommendation Models at Scale
    Acun, Bilge
    Murphy, Matthew
    Wang, Xiaodong
    Nie, Jade
    Wu, Carole-Jean
    Hazelwood, Kim
    2021 27TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2021), 2021, : 802 - 814
  • [28] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [29] Post-training Quantization Methods for Deep Learning Models
    Kluska, Piotr
    Zieba, Maciej
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 467 - 479
  • [30] Making It Simple? Training Deep Learning Models Toward Simplicity
    Repetto, Marco
    La Torre, Davide
    2022 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATIONS (DASA), 2022, : 784 - 789