Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks

Times Cited: 0
Authors
Hacene, Ghouthi Boukli [1 ,2 ]
Gripon, Vincent [2 ]
Arzel, Matthieu [2 ]
Farrugia, Nicolas [2 ]
Bengio, Yoshua [1 ]
Affiliations
[1] Univ Montreal, MILA, Montreal, PQ, Canada
[2] IMT Atlantique, Lab STICC, Nantes, France
DOI
10.1109/newcas49341.2020.9159769
CLC Classification Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Deep Neural Networks (DNNs) in general, and Convolutional Neural Networks (CNNs) in particular, are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to high computational complexity and strongly limits their usability on budget-constrained platforms such as embedded devices. In this paper, we propose a combination of a pruning technique and a quantization scheme that effectively reduces the complexity and memory usage of the convolutional layers of CNNs by replacing the costly convolutional operation with a low-cost multiplexer. We perform experiments on the CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy while drastically reducing the computational and memory footprints compared to the baselines. We also propose an efficient hardware architecture, implemented on Field Programmable Gate Arrays (FPGAs), to accelerate inference; it operates as a pipeline in which multiple layers work simultaneously to speed up the inference process. In contrast with most existing approaches, which rely on external memory or software-defined memory controllers, our work is based on algorithmic optimization and a full-hardware design, enabling a direct, on-chip memory implementation of a DNN while keeping accuracy close to the state of the art.
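The abstract's central idea, replacing each multiply-accumulate-heavy convolution by a multiplexer that selects a single quantized input, can be illustrated with a toy sketch. All function and parameter names below are illustrative assumptions, not the authors' implementation: pruning here simply keeps the largest-magnitude tap of each k x k kernel slice (its index becomes the mux select line), and the surviving weight is quantized to a signed power of two (a hardwired shift in hardware).

```python
import numpy as np

def prune_and_quantize(weights, bits=3):
    """Toy sketch (not the paper's exact algorithm): keep one tap per
    k x k kernel slice and quantize it to a signed power of two, so
    each convolution degenerates into a multiplexer plus a shift."""
    out_c, in_c, kh, kw = weights.shape
    flat = weights.reshape(out_c, in_c, kh * kw)
    # Simplified guided pruning: keep only the largest-magnitude tap of
    # each slice; its position is what the multiplexer would encode.
    idx = np.abs(flat).argmax(axis=2)                      # mux select lines
    kept = np.take_along_axis(flat, idx[..., None], axis=2)[..., 0]
    # Quantize surviving weights to signed powers of two (shift amounts).
    sign = np.sign(kept)
    exp = np.clip(np.round(np.log2(np.abs(kept) + 1e-12)), -(2 ** bits), 0)
    return idx, sign * (2.0 ** exp)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 3))        # (out_ch, in_ch, kh, kw)
idx, q = prune_and_quantize(w)
print(idx.shape, q.shape)                # one select index and one tap per slice
```

In a real design the select indices would be learned (guided) rather than taken greedily, but the sketch shows why the resulting layer needs no multipliers: each output draws on one input, scaled by a fixed power of two.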
Pages: 206 - 209
Page count: 4
Related Papers
50 records
  • [1] Quantized Deep Neural Networks for Energy Efficient Hardware-based Inference
    Ding, Ruizhou
    Liu, Zeye
    Blanton, R. D.
    Marculescu, Diana
    [J]. 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2018, : 1 - 8
  • [2] Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations
    Boo, Yoonho
    Sung, Wonyong
    [J]. 2017 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2017,
  • [3] A Hardware Accelerator Based on Quantized Weights for Deep Neural Networks
    Sreehari, R.
    Deepu, Vijayasenan
    Arulalan, M. R.
    [J]. EMERGING RESEARCH IN ELECTRONICS, COMPUTER SCIENCE AND TECHNOLOGY, ICERECT 2018, 2019, 545 : 1079 - 1091
  • [4] Hardware for Quantized Mixed-Precision Deep Neural Networks
    Rios, Andres
    Nava, Patricia
    [J]. PROCEEDINGS OF THE 2022 15TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS 2022), 2022,
  • [5] Automatic Pruning for Quantized Neural Networks
    Guerra, Luis
    Drummond, Tom
    [J]. 2021 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA 2021), 2021, : 290 - 297
  • [6] Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Dai, Wenrui
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 14 - 17
  • [7] Holistic Filter Pruning for Efficient Deep Neural Networks
    Enderich, Lukas
    Timm, Fabian
    Burgard, Wolfram
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 2595 - 2604
  • [8] ELECTRONIC HARDWARE IMPLEMENTATIONS OF NEURAL NETWORKS
    THAKOOR, AP
    MOOPENN, A
    LAMBE, J
    KHANNA, SK
    [J]. APPLIED OPTICS, 1987, 26 (23) : 5085 - 5092
  • [9] TRP: Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 977 - 983
  • [10] Efficient Softmax Hardware Architecture for Deep Neural Networks
    Du, Gaoming
    Tian, Chao
    Li, Zhenmin
    Zhang, Duoli
    Yin, Yongsheng
    Ouyang, Yiming
    [J]. GLSVLSI '19 - PROCEEDINGS OF THE 2019 ON GREAT LAKES SYMPOSIUM ON VLSI, 2019, : 75 - 80