Fully Dynamic Inference With Deep Neural Networks

Cited by: 22
Authors
Xia, Wenhan [1 ]
Yin, Hongxu [2 ]
Dai, Xiaoliang [3 ]
Jha, Niraj K. [1 ]
Institutions
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08540 USA
[2] NVIDIA, Santa Clara, CA 95050 USA
[3] Facebook, Mobile Comp Vis Team, Menlo Pk, CA 94025 USA
Funding
US National Science Foundation;
Keywords
Conditional computation; deep learning; dynamic execution; dynamic inference; model compression;
DOI
10.1109/TETC.2021.3056031
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of computational cost, high memory bandwidth, and long inference latency, which prevent their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference and self-driving cars. While recently developed methods for creating efficient deep neural networks are making their real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that imparts deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9x fewer floating-point operations (FLOPs) and up to 3.3 percent higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4x fewer FLOPs and up to 4.6 percent higher Top-1 accuracy than the other methods.
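The layer-level mechanism the abstract describes — a compact gate network that, per input, decides which backbone layers to skip and how to scale the retained outputs — can be sketched in miniature. This is an illustrative toy, not the paper's implementation: `gate_net`, the tiny dense "backbone", and the hard 0.5 threshold are all assumptions; the actual L-Net is a trained convolutional predictor with differentiable gating.

```python
import math
import random

random.seed(0)

# Toy "backbone": three dense 4x4 layers standing in for conv layers.
def make_layer():
    return [[random.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(4)]

layers = [make_layer() for _ in range(3)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate_net(x, n_layers):
    """Hypothetical stand-in for L-Net: one gate value in (0, 1) per layer,
    computed from the input instance itself."""
    s = sum(x)
    return [sigmoid(math.tanh(s) + i - 1.0) for i in range(n_layers)]

def dynamic_forward(x, threshold=0.5):
    """Per-instance execution: skip layers whose gate falls below
    `threshold`; scale each retained layer's output by its gate value."""
    gates = gate_net(x, len(layers))
    executed = 0
    for w, g in zip(layers, gates):
        if g < threshold:
            continue  # layer predicted redundant for this input: skipped
        # Dense layer + ReLU, with the output scaled by the gate value.
        x = [g * max(sum(wi * xi for wi, xi in zip(row, x)), 0.0) for row in w]
        executed += 1
    return x, executed

sample = [random.uniform(-1.0, 1.0) for _ in range(4)]
out, executed = dynamic_forward(sample)
print(f"layers executed: {executed} of {len(layers)}")
```

The key point the sketch captures is that the FLOP count varies per input: an "easy" instance whose gates fall below threshold triggers fewer layer executions, while a harder one runs more of the backbone. The channel-level mechanism (C-Net) works analogously, gating individual filters instead of whole layers.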
Pages: 962-972 (11 pages)