Fully Dynamic Inference With Deep Neural Networks

Citations: 16
Authors
Xia, Wenhan [1 ]
Yin, Hongxu [2 ]
Dai, Xiaoliang [3 ]
Jha, Niraj K. [1 ]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08540 USA
[2] NVIDIA, Santa Clara, CA 95050 USA
[3] Facebook, Mobile Comp Vis Team, Menlo Pk, CA 94025 USA
Funding
US National Science Foundation;
Keywords
Conditional computation; deep learning; dynamic execution; dynamic inference; model compression;
DOI
10.1109/TETC.2021.3056031
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of high computational cost, memory bandwidth, and inference latency, which prevents their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference and self-driving cars. While recently developed methods for creating efficient deep neural networks are making their real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that endows deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9x fewer floating-point operations (FLOPs) and up to 3.3 percent higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4x fewer FLOPs and up to 4.6 percent higher Top-1 accuracy than the other methods.
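Since the abstract describes the mechanism only in prose, the following minimal PyTorch sketch illustrates the per-instance gating idea at both channel and layer granularity. It is not the authors' implementation: all names (ChannelGate, GatedConv, SkippableBlock) and design details (a tiny MLP gate over globally pooled features, a straight-through hard threshold, softplus-scaled outputs) are hypothetical assumptions for illustration only.

```python
# Illustrative sketch of per-instance channel/layer gating in the spirit of
# the C-Net / L-Net idea from the abstract. NOT the authors' code; all names
# and design choices here are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelGate(nn.Module):
    """Predicts, per input instance, which output channels to keep and a
    positive scale for the retained channels (assumption: a tiny MLP on
    globally pooled features, with a straight-through hard threshold)."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * out_channels),  # keep-logits + scales
        )

    def forward(self, x: torch.Tensor):
        # x: (N, C_in, H, W) -> global average pool to (N, C_in)
        stats = x.mean(dim=(2, 3))
        logits, scales = self.mlp(stats).chunk(2, dim=1)
        soft = torch.sigmoid(logits)
        # Straight-through estimator: hard 0/1 mask forward, soft gradient back.
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()
        return mask, F.softplus(scales)  # (N, C_out) mask and positive scales


class GatedConv(nn.Module):
    """Conv layer whose output channels are masked and rescaled per instance
    (channel-level dynamics, C-Net style)."""

    def __init__(self, in_channels: int, out_channels: int, **conv_kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **conv_kwargs)
        self.gate = ChannelGate(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask, scale = self.gate(x)  # (N, C_out) each
        y = self.conv(x)
        return y * (mask * scale).unsqueeze(-1).unsqueeze(-1)


class SkippableBlock(nn.Module):
    """Residual block with a single per-instance keep/skip decision
    (layer-level dynamics, L-Net style)."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.gate = ChannelGate(channels, 1)  # one decision per instance

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask, scale = self.gate(x)  # (N, 1) each
        g = (mask * scale).unsqueeze(-1).unsqueeze(-1)
        # With batch size 1 at inference, g == 0 means self.body(x) could be
        # skipped entirely; here we mask for batched-training simplicity.
        return x + g * self.body(x)


if __name__ == "__main__":
    layer = GatedConv(3, 8, kernel_size=3, padding=1)
    out = layer(torch.randn(2, 3, 32, 32))
    print(out.shape)  # torch.Size([2, 8, 32, 32])
```

Note that masking outputs, as above, only emulates the savings during training; realizing the reported FLOP reductions at inference requires actually skipping the gated-off filters or layers (straightforward at batch size 1, where a zero gate means the corresponding computation need not run at all).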
Pages: 962-972
Number of Pages: 11
Related Papers
50 records in total
  • [1] Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks
    Zhao, Shize
    He, Liulu
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021, : 117 - 122
  • [2] Bayesian inference and forecasting in dynamic neural networks with fully Markov switching ARCH noises
    Spezia, Luigi
    Paroli, Roberta
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2008, 37 (13) : 2079 - 2094
  • [3] DEEP NEURAL NETWORKS FOR ESTIMATION AND INFERENCE
    Farrell, Max H.
    Liang, Tengyuan
    Misra, Sanjog
    ECONOMETRICA, 2021, 89 (01) : 181 - 213
  • [4] Property Inference for Deep Neural Networks
    Gopinath, Divya
    Converse, Hayes
    Pasareanu, Corina S.
    Taly, Ankur
    34TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2019), 2019, : 809 - 821
  • [5] Dynamic Representations Toward Efficient Inference on Deep Neural Networks by Decision Gates
    Shafiee, Mohammad Saeed
    Shafiee, Mohammad Javad
    Wong, Alexander
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 677 - 685
  • [6] Scaling for edge inference of deep neural networks
    Xu, Xiaowei
    Ding, Yukun
    Hu, Sharon Xiaobo
    Niemier, Michael
    Cong, Jason
    Hu, Yu
    Shi, Yiyu
    NATURE ELECTRONICS, 2018, 1 (04): : 216 - 222
  • [7] Secure and Verifiable Inference in Deep Neural Networks
    Xu, Guowen
    Li, Hongwei
    Ren, Hao
    Sun, Jianfei
    Xu, Shengmin
    Ning, Jianting
    Yang, Haomiao
    Yang, Kan
    Deng, Robert H.
    36TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2020), 2020, : 784 - 797
  • [8] xDNN: Inference for Deep Convolutional Neural Networks
    D'Alberto, Paolo
    Wu, Victor
    Ng, Aaron
    Nimaiyar, Rahul
    Delaye, Elliott
    Sirasao, Ashish
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2022, 15 (02)
  • [9] Variational Inference for Infinitely Deep Neural Networks
    Nazaret, Achille
    Blei, David
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022