A Fast Compressed Hardware Architecture for Deep Neural Networks

Cited by: 0
Authors:
Ansari, Anaam [1 ]
Shelton, Allen [1 ]
Ogunfunmi, Tokunbo [1 ]
Panchbhaiyye, Vineet [1 ]
Affiliations:
[1] Santa Clara Univ, Dept Elect & Comp Engn, Santa Clara, CA 95053 USA
Keywords:
deep learning; convolutional neural network; hardware architecture; design methodology; edge intelligence; channel pruning; compressed network
DOI:
10.1109/ISCAS48785.2022.9937651
CLC classification:
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Discipline codes:
0808; 0809
Abstract:
Hardware acceleration of Deep Neural Networks (DNNs) is critical to many edge applications. The acceleration solutions available today typically target GPU, CPU, FPGA, and ASIC platforms. The Single Partial Product 2-D Convolution (SPP2D) is a hardware architecture for fast 2-D convolution that can be used to implement a convolutional neural network (CNN). An SPP2D-based CNN avoids re-fetching input pixels when calculating partial products, and it computes the output for any input size and kernel with lower latency and power consumption than several other popular techniques. An SPP2D-based VGGNet-16 rivals the performance of existing implementations, including a FIFO-based CNN architecture that computes convolution results from row-wise inputs rather than traditional tile-based processing. In this paper, we present an SPP2D-based hardware accelerator for a channel-compressed network. We find that the low-power SPP2D implementation achieves both lower power and lower execution time than other contemporary compressed designs. A compressed network requires less on-chip memory, reducing the most power-consuming task of moving data from off-chip to on-chip; this yields a considerable reduction in power consumption due to the reduced memory traffic. The channel-pruned SPP2D accelerator is a low-power design that consumes 298 mW, roughly 0.01x to 0.37x the power of existing works, while also achieving a low execution time of 0.9 s.
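The abstract's central idea, computing every partial product of an input pixel at the moment it is fetched so that no pixel is ever read twice, can be mirrored in software. Below is a minimal Python sketch of that single-fetch dataflow under stated assumptions: the function name spp2d_like_conv, the plain-list data layout, and the valid (no-padding) cross-correlation are illustrative choices made here, not the paper's actual pipelined hardware design.

```python
def spp2d_like_conv(x, k):
    """Software analogue of a single-fetch 2-D convolution.

    Each input pixel is read exactly once; all of its partial
    products are immediately accumulated into every output
    position it contributes to (valid cross-correlation).
    """
    H, W = len(x), len(x[0])
    K = len(k)
    out_h, out_w = H - K + 1, W - K + 1
    y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(H):
        for j in range(W):
            px = x[i][j]  # the single fetch of this pixel
            for a in range(K):
                for b in range(K):
                    oi, oj = i - a, j - b  # output this product feeds
                    if 0 <= oi < out_h and 0 <= oj < out_w:
                        y[oi][oj] += px * k[a][b]
    return y

# Toy check against the usual sliding-window definition.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]
print(spp2d_like_conv(x, k))  # [[6.0, 8.0], [12.0, 14.0]]
```

This single-fetch scheduling is the property the abstract contrasts with traditional tile-based processing, which re-reads the overlapping borders of input tiles; it is also what suits row-wise, FIFO-fed input streaming.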
Pages: 370-374 (5 pages)