A Fast Compressed Hardware Architecture for Deep Neural Networks

Cited by: 0
Authors:
Ansari, Anaam [1 ]
Shelton, Allen [1 ]
Ogunfunmi, Tokunbo [1 ]
Panchbhaiyye, Vineet [1 ]
Affiliations:
[1] Santa Clara Univ, Dept Elect & Comp Engn, Santa Clara, CA 95053 USA
Keywords:
deep learning; convolutional neural network; hardware architecture; design methodology; edge intelligence; channel pruning; compressed network
DOI:
10.1109/ISCAS48785.2022.9937651
CLC classification:
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Discipline codes:
0808; 0809
Abstract:
Hardware acceleration of Deep Neural Networks (DNNs) is critical to many edge applications. The acceleration solutions available today typically target GPU, CPU, FPGA, and ASIC platforms. The Single Partial Product 2-D Convolution (SPP2D) is a hardware architecture for fast 2-D convolution that can be used to implement a convolutional neural network (CNN). An SPP2D-based CNN avoids re-fetching input pixels when calculating partial products, and it computes the output for any input size and kernel with lower latency and power consumption than several other popular techniques. An SPP2D-based VGGNet-16 rivals the performance of existing implementations, including a FIFO-based CNN architecture that computes convolution results from row-wise inputs rather than traditional tile-based processing. In this paper, we present an SPP2D-based hardware accelerator for a channel-compressed network. We find that the low-power SPP2D implementation achieves both lower power and lower execution time than other contemporary compressed designs. A compressed network requires less on-chip memory, reducing the most power-consuming task of moving data from off-chip to on-chip; this yields a considerable reduction in power consumption due to the reduced memory traffic. The channel-pruned SPP2D accelerator is a low-power design that consumes 298 mW, roughly 0.01x to 0.37x the power of existing works, while also achieving a low execution time of 0.9 s.
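The abstract's central idea, computing every partial product of an input pixel at the moment it is fetched so that no pixel is ever read twice, can be mirrored in software. Below is a minimal Python sketch of that single-fetch dataflow under stated assumptions: the function name spp2d_like_conv, the plain-list data layout, and the valid (no-padding) cross-correlation are illustrative choices made here, not the paper's actual pipelined hardware design.

```python
def spp2d_like_conv(x, k):
    """Software analogue of a single-fetch 2-D convolution.

    Each input pixel is read exactly once; all of its partial
    products are immediately accumulated into every output
    position it contributes to (valid cross-correlation).
    """
    H, W = len(x), len(x[0])
    K = len(k)
    out_h, out_w = H - K + 1, W - K + 1
    y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(H):
        for j in range(W):
            px = x[i][j]  # the single fetch of this pixel
            for a in range(K):
                for b in range(K):
                    oi, oj = i - a, j - b  # output this product feeds
                    if 0 <= oi < out_h and 0 <= oj < out_w:
                        y[oi][oj] += px * k[a][b]
    return y

# Toy check against the usual sliding-window definition.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]
print(spp2d_like_conv(x, k))  # [[6.0, 8.0], [12.0, 14.0]]
```

This single-fetch scheduling is the property the abstract contrasts with traditional tile-based processing, which re-reads the overlapping borders of input tiles; it is also what suits row-wise, FIFO-fed input streaming.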
Pages: 370-374 (5 pages)