The reusability prior: comparing deep learning models without training

Cited by: 1
|
Authors
Polat, Aydin Goze [1 ]
Alpaslan, Ferda Nur [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
Source
Keywords
entropy; deep learning; parameter efficiency; reusability; neural networks
DOI
10.1088/2632-2153/acc713
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Various choices can affect the performance of deep learning models. We conjecture that differences in the number of contexts for model components during training are critical. We generalize this notion by defining the reusability prior as follows: model components are forced to function in diverse contexts not only due to the training data, augmentation, and regularization choices, but also due to the model design itself. We focus on the design aspect and introduce a graph-based methodology to estimate the number of contexts for each learnable parameter. This allows models to be compared without any training. We provide supporting evidence with cross-layer parameter-sharing experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K benchmarks. We give examples of parameter-sharing models that outperform baselines with at least 60% more parameters. The graph-analysis-based quantities we introduce for the reusability prior align well with the results, including in at least two important edge cases. We conclude that the reusability prior provides a viable research direction for model analysis based on a very simple idea: counting the number of contexts for model parameters.
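
As an informal illustration of the counting idea, the Python sketch below approximates a parameter's number of contexts by the number of distinct input-to-operation paths, summed over all operations that use the parameter, in a toy computation DAG. This is not the paper's exact graph methodology; the model, operation, and parameter names and the path-count proxy are assumptions made only for illustration.

# Toy illustration only: NOT the paper's exact method. It sketches the idea of
# counting "contexts" per learnable parameter on a tiny computation DAG.
# All op and parameter names below are hypothetical.

def count_paths(edges, node, cache=None):
    # Number of distinct paths from the data source "input" to `node`.
    if cache is None:
        cache = {}
    if node == "input":
        return 1
    if node not in cache:
        cache[node] = sum(count_paths(edges, up, cache) for up in edges[node])
    return cache[node]

def context_counts(edges, params):
    # Crude proxy: for each parameter, sum the input->op path counts over the
    # operations the parameter is attached to.
    return {p: sum(count_paths(edges, op) for op in ops) for p, ops in params.items()}

# DAG as op -> upstream ops: a plain three-layer chain.
edges = {"conv1": ["input"], "conv2": ["conv1"], "conv3": ["conv2"]}

# Model A: one weight per layer. Model B: one weight shared across all layers.
params_a = {"w1": ["conv1"], "w2": ["conv2"], "w3": ["conv3"]}
params_b = {"w_shared": ["conv1", "conv2", "conv3"]}

print(context_counts(edges, params_a))  # {'w1': 1, 'w2': 1, 'w3': 1}
print(context_counts(edges, params_b))  # {'w_shared': 3}

Under this proxy the shared weight sees three contexts while each unshared weight sees one, mirroring the abstract's intuition that cross-layer parameter sharing multiplies the contexts in which a parameter must function.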
Pages: 17
Related papers
50 in total
  • [1] Training deep learning based image denoisers from undersampled measurements without ground truth and without image prior
    Zhussip, Magauiya
    Soltanayev, Shakarim
    Chun, Se Young
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10247 - 10256
  • [2] Exploring new strategies for comparing deep learning models
    Butler, Samantha J.
    Price, Stanton R.
    Hadia, Xian Mae D.
    Price, Steven R.
    Carley, Samantha C.
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS V, 2023, 12538
  • [3] A Deep Learning Method for Comparing Bayesian Hierarchical Models
    Elsemueller, Lasse
    Schnuerch, Martin
    Buerkner, Paul-Christian
    Radev, Stefan T.
    PSYCHOLOGICAL METHODS, 2024,
  • [4] Continuous Training and Deployment of Deep Learning Models
    Prapas, Ioannis
    Derakhshan, Behrouz
    Mahdiraji, Alireza Rezaei
    Markl, Volker
    Datenbank-Spektrum, 2021, 21 (03) : 203 - 212
  • [5] Towards Training Reproducible Deep Learning Models
    Chen, Boyuan
    Wen, Mingzhi
    Shi, Yong
    Lin, Dayi
    Rajbahadur, Gopi Krishnan
    Jiang, Zhen Ming
    2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, : 2202 - 2214
  • [6] Tensor Normal Training for Deep Learning Models
    Ren, Yi
    Goldfarb, Donald
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [7] Within Reach? Learning to touch objects without prior models
    de La Bourdonnaye, Francois
    Teuliere, Celine
    Chateau, Thierry
    Triesch, Jochen
    2019 JOINT IEEE 9TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING AND EPIGENETIC ROBOTICS (ICDL-EPIROB), 2019, : 93 - 98
  • [8] Deep Learning or Deep Ignorance? Comparing Untrained Recurrent Models in Educational Contexts
    Botelho, Anthony F.
    Prihar, Ethan
    Heffernan, Neil T.
    ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, 2022, 13355 : 281 - 293
  • [9] Comparing Deep Learning Models for Image Classification in Urban Flooding
    Goncalves, Andre
    Resende, Luis
    Conci, Aura
    2024 31ST INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING, IWSSIP 2024, 2024,
  • [10] Evolution and Role of Optimizers in Training Deep Learning Models
    Wen, XiaoHao
    Zhou, MengChu
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (10) : 2039 - 2042