Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks

Times Cited: 0
Authors
Hacene, Ghouthi Boukli [1 ,2 ]
Gripon, Vincent [2 ]
Arzel, Matthieu [2 ]
Farrugia, Nicolas [2 ]
Bengio, Yoshua [1 ]
Affiliations
[1] Univ Montreal, MILA, Montreal, PQ, Canada
[2] IMT Atlantique, Lab STICC, Nantes, France
DOI
10.1109/newcas49341.2020.9159769
CLC Classification Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Deep Neural Networks (DNNs) in general, and Convolutional Neural Networks (CNNs) in particular, are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to high computational complexity and strongly limits their usability on budget-constrained platforms such as embedded devices. In this paper, we propose a combination of a pruning technique and a quantization scheme that effectively reduces the complexity and memory usage of the convolutional layers of CNNs by replacing the costly convolutional operation with a low-cost multiplexer. We perform experiments on the CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy while drastically reducing the computational and memory footprints compared to the baselines. We also propose an efficient hardware architecture, implemented on Field Programmable Gate Arrays (FPGAs), to accelerate inference; it operates as a pipeline in which multiple layers work simultaneously to speed up the inference process. In contrast with most existing approaches, which rely on external memory or software-defined memory controllers, our work is based on algorithmic optimization and a full-hardware design, enabling a direct, on-chip memory implementation of a DNN while keeping accuracy close to the state of the art.
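The abstract's central idea, replacing each multiply-accumulate-heavy convolution by a multiplexer that selects a single quantized input, can be illustrated with a toy sketch. All function and parameter names below are illustrative assumptions, not the authors' implementation: pruning here simply keeps the largest-magnitude tap of each k x k kernel slice (its index becomes the mux select line), and the surviving weight is quantized to a signed power of two (a hardwired shift in hardware).

```python
import numpy as np

def prune_and_quantize(weights, bits=3):
    """Toy sketch (not the paper's exact algorithm): keep one tap per
    k x k kernel slice and quantize it to a signed power of two, so
    each convolution degenerates into a multiplexer plus a shift."""
    out_c, in_c, kh, kw = weights.shape
    flat = weights.reshape(out_c, in_c, kh * kw)
    # Simplified guided pruning: keep only the largest-magnitude tap of
    # each slice; its position is what the multiplexer would encode.
    idx = np.abs(flat).argmax(axis=2)                      # mux select lines
    kept = np.take_along_axis(flat, idx[..., None], axis=2)[..., 0]
    # Quantize surviving weights to signed powers of two (shift amounts).
    sign = np.sign(kept)
    exp = np.clip(np.round(np.log2(np.abs(kept) + 1e-12)), -(2 ** bits), 0)
    return idx, sign * (2.0 ** exp)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 3))        # (out_ch, in_ch, kh, kw)
idx, q = prune_and_quantize(w)
print(idx.shape, q.shape)                # one select index and one tap per slice
```

In a real design the select indices would be learned (guided) rather than taken greedily, but the sketch shows why the resulting layer needs no multipliers: each output draws on one input, scaled by a fixed power of two.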
Pages: 206 - 209
Page count: 4
Related Papers
50 records
  • [1] Quantized Deep Neural Networks for Energy Efficient Hardware-based Inference
    Ding, Ruizhou
    Liu, Zeye
    Blanton, R. D.
    Marculescu, Diana
    [J]. 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2018, : 1 - 8
  • [2] Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations
    Boo, Yoonho
    Sung, Wonyong
    [J]. 2017 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2017,
  • [3] A Hardware Accelerator Based on Quantized Weights for Deep Neural Networks
    Sreehari, R.
    Deepu, Vijayasenan
    Arulalan, M. R.
    [J]. EMERGING RESEARCH IN ELECTRONICS, COMPUTER SCIENCE AND TECHNOLOGY, ICERECT 2018, 2019, 545 : 1079 - 1091
  • [4] Hardware for Quantized Mixed-Precision Deep Neural Networks
    Rios, Andres
    Nava, Patricia
    [J]. PROCEEDINGS OF THE 2022 15TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS 2022), 2022,
  • [5] Automatic Pruning for Quantized Neural Networks
    Guerra, Luis
    Drummond, Tom
    [J]. 2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021, : 290 - 297
  • [6] Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Dai, Wenrui
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 14 - 17
  • [7] Holistic Filter Pruning for Efficient Deep Neural Networks
    Enderich, Lukas
    Timm, Fabian
    Burgard, Wolfram
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 2595 - 2604
  • [8] ELECTRONIC HARDWARE IMPLEMENTATIONS OF NEURAL NETWORKS
    THAKOOR, AP
    MOOPENN, A
    LAMBE, J
    KHANNA, SK
    [J]. APPLIED OPTICS, 1987, 26 (23) : 5085 - 5092
  • [9] TRP: Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 977 - 983
  • [10] Efficient Softmax Hardware Architecture for Deep Neural Networks
    Du, Gaoming
    Tian, Chao
    Li, Zhenmin
    Zhang, Duoli
    Yin, Yongsheng
    Ouyang, Yiming
    [J]. GLSVLSI '19 - PROCEEDINGS OF THE 2019 ON GREAT LAKES SYMPOSIUM ON VLSI, 2019, : 75 - 80