Fully Dynamic Inference With Deep Neural Networks

Citations: 16
Authors
Xia, Wenhan [1 ]
Yin, Hongxu [2 ]
Dai, Xiaoliang [3 ]
Jha, Niraj K. [1 ]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08540 USA
[2] NVIDIA, Santa Clara, CA 95050 USA
[3] Facebook, Mobile Comp Vis Team, Menlo Pk, CA 94025 USA
Funding
US National Science Foundation;
Keywords
Conditional computation; deep learning; dynamic execution; dynamic inference; model compression;
DOI
10.1109/TETC.2021.3056031
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of high computational cost, memory bandwidth, and inference latency, which prevents their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference and self-driving cars. While recently developed methods for creating efficient deep neural networks are making their real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that endows deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual convolutional filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale retained computation outputs to maximize task accuracy. By integrating L-Net and C-Net into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9x fewer floating-point operations (FLOPs) and up to 3.3 percent higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4x fewer FLOPs and up to 4.6 percent higher Top-1 accuracy than the other methods.
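Since the abstract describes the mechanism only in prose, the following minimal PyTorch sketch illustrates the per-instance gating idea at both channel and layer granularity. It is not the authors' implementation: all names (ChannelGate, GatedConv, SkippableBlock) and design details (a tiny MLP gate over globally pooled features, a straight-through hard threshold, softplus-scaled outputs) are hypothetical assumptions for illustration only.

```python
# Illustrative sketch of per-instance channel/layer gating in the spirit of
# the C-Net / L-Net idea from the abstract. NOT the authors' code; all names
# and design choices here are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelGate(nn.Module):
    """Predicts, per input instance, which output channels to keep and a
    positive scale for the retained channels (assumption: a tiny MLP on
    globally pooled features, with a straight-through hard threshold)."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * out_channels),  # keep-logits + scales
        )

    def forward(self, x: torch.Tensor):
        # x: (N, C_in, H, W) -> global average pool to (N, C_in)
        stats = x.mean(dim=(2, 3))
        logits, scales = self.mlp(stats).chunk(2, dim=1)
        soft = torch.sigmoid(logits)
        # Straight-through estimator: hard 0/1 mask forward, soft gradient back.
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()
        return mask, F.softplus(scales)  # (N, C_out) mask and positive scales


class GatedConv(nn.Module):
    """Conv layer whose output channels are masked and rescaled per instance
    (channel-level dynamics, C-Net style)."""

    def __init__(self, in_channels: int, out_channels: int, **conv_kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **conv_kwargs)
        self.gate = ChannelGate(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask, scale = self.gate(x)  # (N, C_out) each
        y = self.conv(x)
        return y * (mask * scale).unsqueeze(-1).unsqueeze(-1)


class SkippableBlock(nn.Module):
    """Residual block with a single per-instance keep/skip decision
    (layer-level dynamics, L-Net style)."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.gate = ChannelGate(channels, 1)  # one decision per instance

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask, scale = self.gate(x)  # (N, 1) each
        g = (mask * scale).unsqueeze(-1).unsqueeze(-1)
        # With batch size 1 at inference, g == 0 means self.body(x) could be
        # skipped entirely; here we mask for batched-training simplicity.
        return x + g * self.body(x)


if __name__ == "__main__":
    layer = GatedConv(3, 8, kernel_size=3, padding=1)
    out = layer(torch.randn(2, 3, 32, 32))
    print(out.shape)  # torch.Size([2, 8, 32, 32])
```

Note that masking outputs, as above, only emulates the savings during training; realizing the reported FLOP reductions at inference requires actually skipping the gated-off filters or layers (straightforward at batch size 1, where a zero gate means the corresponding computation need not run at all).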
Pages: 962-972
Number of Pages: 11
Related Papers
50 records in total
  • [1] Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks
    Zhao, Shize
    He, Liulu
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021, : 117 - 122
  • [2] Bayesian inference and forecasting in dynamic neural networks with fully Markov switching ARCH noises
    Spezia, Luigi
    Paroli, Roberta
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2008, 37 (13) : 2079 - 2094
  • [3] DEEP NEURAL NETWORKS FOR ESTIMATION AND INFERENCE
    Farrell, Max H.
    Liang, Tengyuan
    Misra, Sanjog
    ECONOMETRICA, 2021, 89 (01) : 181 - 213
  • [4] Property Inference for Deep Neural Networks
    Gopinath, Divya
    Converse, Hayes
    Pasareanu, Corina S.
    Taly, Ankur
    34TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2019), 2019, : 809 - 821
  • [5] Dynamic Representations Toward Efficient Inference on Deep Neural Networks by Decision Gates
    Shafiee, Mohammad Saeed
    Shafiee, Mohammad Javad
    Wong, Alexander
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 677 - 685
  • [6] Scaling for edge inference of deep neural networks
    Xu, Xiaowei
    Ding, Yukun
    Hu, Sharon Xiaobo
    Niemier, Michael
    Cong, Jason
    Hu, Yu
    Shi, Yiyu
    NATURE ELECTRONICS, 2018, 1 (04): : 216 - 222
  • [7] Secure and Verifiable Inference in Deep Neural Networks
    Xu, Guowen
    Li, Hongwei
    Ren, Hao
    Sun, Jianfei
    Xu, Shengmin
    Ning, Jianting
    Yang, Haomiao
    Yang, Kan
    Deng, Robert H.
    36TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2020), 2020, : 784 - 797
  • [8] xDNN: Inference for Deep Convolutional Neural Networks
    D'Alberto, Paolo
    Wu, Victor
    Ng, Aaron
    Nimaiyar, Rahul
    Delaye, Elliott
    Sirasao, Ashish
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2022, 15 (02)
  • [9] Variational Inference for Infinitely Deep Neural Networks
    Nazaret, Achille
    Blei, David
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022