SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks

Cited by: 647
Authors
Parashar, Angshuman [1]
Rhu, Minsoo [1]
Mukkara, Anurag [2]
Puglielli, Antonio [3]
Venkatesan, Rangharajan [1]
Khailany, Brucek [1]
Emer, Joel [1,2]
Keckler, Stephen W. [1]
Dally, William J. [1,4]
Affiliations
[1] NVIDIA, Santa Clara, CA 95051 USA
[2] MIT, Cambridge, MA 02139 USA
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Stanford Univ, Stanford, CA 94305 USA
Keywords
Convolutional neural networks; accelerator architecture
DOI
10.1145/3079856.3080254
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional Neural Networks (CNNs) have emerged as a fundamental technology for machine learning. High performance and extreme energy efficiency are critical for deployments of CNNs, especially on mobile platforms such as autonomous vehicles, cameras, and electronic personal assistants. This paper introduces the Sparse CNN (SCNN) accelerator architecture, which improves performance and energy efficiency by exploiting the zero-valued weights that stem from network pruning during training and the zero-valued activations that arise from the common ReLU operator. Specifically, SCNN employs a novel dataflow that maintains the sparse weights and activations in a compressed encoding, which eliminates unnecessary data transfers and reduces storage requirements. Furthermore, the SCNN dataflow facilitates efficient delivery of those weights and activations to a multiplier array, where they are extensively reused; product accumulation is performed in a novel accumulator array. On contemporary neural networks, SCNN improves performance and energy by 2.7x and 2.3x, respectively, over a comparably provisioned dense CNN accelerator.
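The core idea the abstract describes can be sketched in a few lines of Python. This is an illustrative software model, not the paper's hardware: the function names `compress` and `sparse_conv1d` are assumptions, and a 1-D "valid" convolution stands in for the full CNN layer. It shows how keeping only nonzero weights and activations (with their indices) lets every multiply operate on useful data, with each product scattered to the accumulator slot given by its output coordinate.

```python
def compress(values):
    """A simple compressed encoding standing in for SCNN's zero-free
    format: keep only the nonzero values, paired with their indices."""
    nz = [(i, v) for i, v in enumerate(values) if v != 0]
    return [v for _, v in nz], [i for i, _ in nz]

def sparse_conv1d(weights, activations, out_len):
    """1-D 'valid' convolution over compressed operands. Only
    nonzero weight * nonzero activation products are computed;
    zero operands never reach the multiplier loop."""
    w_vals, w_idx = compress(weights)
    a_vals, a_idx = compress(activations)
    out = [0] * out_len
    for wv, wi in zip(w_vals, w_idx):      # each nonzero weight ...
        for av, ai in zip(a_vals, a_idx):  # ... meets each nonzero activation
            o = ai - wi                    # output coordinate of this product
            if 0 <= o < out_len:
                out[o] += wv * av          # scatter-accumulate
    return out

# With sparse inputs, most products are skipped entirely:
# weights [1, 0, 2] and activations [0, 3, 0, 4] need only
# 2 x 2 = 4 candidate multiplies instead of the dense 2 x 3 = 6.
print(sparse_conv1d([1, 0, 2], [0, 3, 0, 4], out_len=2))
```

In the actual SCNN design, the inner double loop corresponds to broadcasting nonzero weights and activations to a multiplier array, and the indexed `out[o] +=` corresponds to routing products into banks of the accumulator array; this sketch only captures the dataflow's arithmetic.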
Pages: 27-40 (14 pages)
Related papers (50 records total)
  • [1] Supporting Compressed-Sparse Activations and Weights on SIMD-like Accelerator for Sparse Convolutional Neural Networks
    Lin, Chien-Yu
    Lai, Bo-Cheng
    [J]. 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2018, : 105 - 110
  • [2] An Efficient Accelerator for Sparse Convolutional Neural Networks
    You, Weijie
    Wu, Chang
    [J]. 2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019
  • [3] SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks
    Gondimalla, Ashish
    Chesnut, Noah
    Thottethodi, Mithuna
    Vijaykumar, T. N.
    [J]. MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 151 - 165
  • [4] CoNNa-Hardware accelerator for compressed convolutional neural networks
    Struharik, Rastislav J. R.
    Vukobratovic, Bogdan Z.
    Erdeljan, Andrea M.
    Rakanovic, Damjan M.
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2020, 73
  • [5] An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs
    Lu, Liqiang
    Xie, Jiaming
    Huang, Ruirui
    Zhang, Jiansong
    Lin, Wei
    Liang, Yun
    [J]. 2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 17 - 25
  • [6] An Efficient and Flexible Accelerator Design for Sparse Convolutional Neural Networks
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    Wei, Jinghe
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2021, 68 (07) : 2936 - 2949
  • [7] Search-free Accelerator for Sparse Convolutional Neural Networks
    Liu, Bosheng
    Chen, Xiaoming
    Han, Yinhe
    Wang, Ying
    Li, Jiajun
    Xu, Haobo
    Li, Xiaowei
    [J]. 2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020, 2020, : 524 - 529
  • [8] SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators
    Yoo, Mingi
    Song, Jaeyong
    Lee, Jounghoo
    Kim, Namhyung
    Kim, Youngsok
    Lee, Jinho
    [J]. 2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1 - 14
  • [9] An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA
    Yin, Xiaodi
    Wu, Zhipeng
    Li, Dejian
    Shen, Chongfei
    Liu, Yu
    [J]. IEEE EMBEDDED SYSTEMS LETTERS, 2024, 16 (02) : 158 - 161
  • [10] An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
    Zhu, Chaoyang
    Huang, Kejie
    Yang, Shuyuan
    Zhu, Ziqi
    Zhang, Hejia
    Shen, Haibin
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (09) : 1953 - 1965