Fully Dynamic Inference With Deep Neural Networks

Cited by: 22
Authors
Xia, Wenhan [1 ]
Yin, Hongxu [2 ]
Dai, Xiaoliang [3 ]
Jha, Niraj K. [1 ]
Institutions
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08540 USA
[2] NVIDIA, Santa Clara, CA 95050 USA
[3] Facebook, Mobile Comp Vis Team, Menlo Pk, CA 94025 USA
Funding
US National Science Foundation;
Keywords
Conditional computation; deep learning; dynamic execution; dynamic inference; model compression;
DOI
10.1109/TETC.2021.3056031
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of computational cost, high memory bandwidth, and long inference latency, which prevent their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference and self-driving cars. While recently developed methods for creating efficient deep neural networks are making their real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that imparts deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9x fewer floating-point operations (FLOPs) and up to 3.3 percent higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4x fewer FLOPs and up to 4.6 percent higher Top-1 accuracy than the other methods.
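The layer-level mechanism the abstract describes — a compact gate network that, per input, decides which backbone layers to skip and how to scale the retained outputs — can be sketched in miniature. This is an illustrative toy, not the paper's implementation: `gate_net`, the tiny dense "backbone", and the hard 0.5 threshold are all assumptions; the actual L-Net is a trained convolutional predictor with differentiable gating.

```python
import math
import random

random.seed(0)

# Toy "backbone": three dense 4x4 layers standing in for conv layers.
def make_layer():
    return [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(4)]

layers = [make_layer() for _ in range(3)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate_net(x, n_layers):
    """Hypothetical stand-in for L-Net: one gate value in (0, 1) per layer,
    computed from the input instance itself."""
    s = sum(x)
    return [sigmoid(math.tanh(s) + i - 1.0) for i in range(n_layers)]

def dynamic_forward(x, threshold=0.5):
    """Per-instance execution: skip layers whose gate falls below
    `threshold`; scale each retained layer's output by its gate value."""
    gates = gate_net(x, len(layers))
    executed = 0
    for w, g in zip(layers, gates):
        if g < threshold:
            continue  # layer predicted redundant for this input: skipped
        # Dense layer + ReLU, with the output scaled by the gate value.
        x = [g * max(sum(wi * xi for wi, xi in zip(row, x)), 0.0) for row in w]
        executed += 1
    return x, executed

sample = [random.uniform(-1.0, 1.0) for _ in range(4)]
out, executed = dynamic_forward(sample)
print(f"layers executed: {executed} of {len(layers)}")
```

The key point the sketch captures is that the FLOP count varies per input: an "easy" instance whose gates fall below threshold triggers fewer layer executions, while a harder one runs more of the backbone. The channel-level mechanism (C-Net) works analogously, gating individual filters instead of whole layers.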
Pages: 962-972 (11 pages)