SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks

Cited by: 647
Authors
Parashar, Angshuman [1]
Rhu, Minsoo [1]
Mukkara, Anurag [2]
Puglielli, Antonio [3]
Venkatesan, Rangharajan [1]
Khailany, Brucek [1]
Emer, Joel [1,2]
Keckler, Stephen W. [1]
Dally, William J. [1,4]
Affiliations
[1] NVIDIA, Santa Clara, CA 95051 USA
[2] MIT, Cambridge, MA 02139 USA
[3] Univ Calif Berkeley, Berkeley, CA USA
[4] Stanford Univ, Stanford, CA 94305 USA
Keywords
Convolutional neural networks; accelerator architecture
DOI
10.1145/3079856.3080254
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional Neural Networks (CNNs) have emerged as a fundamental technology for machine learning. High performance and extreme energy efficiency are critical for deployments of CNNs, especially on mobile platforms such as autonomous vehicles, cameras, and electronic personal assistants. This paper introduces the Sparse CNN (SCNN) accelerator architecture, which improves performance and energy efficiency by exploiting the zero-valued weights that stem from network pruning during training and the zero-valued activations that arise from the common ReLU operator. Specifically, SCNN employs a novel dataflow that maintains the sparse weights and activations in a compressed encoding, which eliminates unnecessary data transfers and reduces storage requirements. Furthermore, the SCNN dataflow facilitates efficient delivery of those weights and activations to a multiplier array, where they are extensively reused; product accumulation is performed in a novel accumulator array. On contemporary neural networks, SCNN improves performance and energy by 2.7x and 2.3x, respectively, over a comparably provisioned dense CNN accelerator.
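The core idea the abstract describes can be sketched in a few lines of Python. This is an illustrative software model, not the paper's hardware: the function names `compress` and `sparse_conv1d` are assumptions, and a 1-D "valid" convolution stands in for the full CNN layer. It shows how keeping only nonzero weights and activations (with their indices) lets every multiply operate on useful data, with each product scattered to the accumulator slot given by its output coordinate.

```python
def compress(values):
    """A simple compressed encoding standing in for SCNN's zero-free
    format: keep only the nonzero values, paired with their indices."""
    nz = [(i, v) for i, v in enumerate(values) if v != 0]
    return [v for _, v in nz], [i for i, _ in nz]

def sparse_conv1d(weights, activations, out_len):
    """1-D 'valid' convolution over compressed operands. Only
    nonzero weight * nonzero activation products are computed;
    zero operands never reach the multiplier loop."""
    w_vals, w_idx = compress(weights)
    a_vals, a_idx = compress(activations)
    out = [0] * out_len
    for wv, wi in zip(w_vals, w_idx):      # each nonzero weight ...
        for av, ai in zip(a_vals, a_idx):  # ... meets each nonzero activation
            o = ai - wi                    # output coordinate of this product
            if 0 <= o < out_len:
                out[o] += wv * av          # scatter-accumulate
    return out

# With sparse inputs, most products are skipped entirely:
# weights [1, 0, 2] and activations [0, 3, 0, 4] need only
# 2 x 2 = 4 candidate multiplies instead of the dense 2 x 3 = 6.
print(sparse_conv1d([1, 0, 2], [0, 3, 0, 4], out_len=2))
```

In the actual SCNN design, the inner double loop corresponds to broadcasting nonzero weights and activations to a multiplier array, and the indexed `out[o] +=` corresponds to routing products into banks of the accumulator array; this sketch only captures the dataflow's arithmetic.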
Pages: 27-40 (14 pages)
Related papers (50 records total)
  • [1] Supporting Compressed-Sparse Activations and Weights on SIMD-like Accelerator for Sparse Convolutional Neural Networks
    Lin, Chien-Yu
    Lai, Bo-Cheng
    [J]. 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2018, : 105 - 110
  • [2] An Efficient Accelerator for Sparse Convolutional Neural Networks
    You, Weijie
    Wu, Chang
    [J]. 2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019
  • [3] SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks
    Gondimalla, Ashish
    Chesnut, Noah
    Thottethodi, Mithuna
    Vijaykumar, T. N.
    [J]. MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 151 - 165
  • [4] CoNNa-Hardware accelerator for compressed convolutional neural networks
    Struharik, Rastislav J. R.
    Vukobratovic, Bogdan Z.
    Erdeljan, Andrea M.
    Rakanovic, Damjan M.
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2020, 73
  • [5] An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs
    Lu, Liqiang
    Xie, Jiaming
    Huang, Ruirui
    Zhang, Jiansong
    Lin, Wei
    Liang, Yun
    [J]. 2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 17 - 25
  • [6] An Efficient and Flexible Accelerator Design for Sparse Convolutional Neural Networks
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    Wei, Jinghe
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2021, 68 (07) : 2936 - 2949
  • [7] Search-free Accelerator for Sparse Convolutional Neural Networks
    Liu, Bosheng
    Chen, Xiaoming
    Han, Yinhe
    Wang, Ying
    Li, Jiajun
    Xu, Haobo
    Li, Xiaowei
    [J]. 2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020, 2020, : 524 - 529
  • [8] SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators
    Yoo, Mingi
    Song, Jaeyong
    Lee, Jounghoo
    Kim, Namhyung
    Kim, Youngsok
    Lee, Jinho
    [J]. 2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1 - 14
  • [9] An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA
    Yin, Xiaodi
    Wu, Zhipeng
    Li, Dejian
    Shen, Chongfei
    Liu, Yu
    [J]. IEEE EMBEDDED SYSTEMS LETTERS, 2024, 16 (02) : 158 - 161
  • [10] An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
    Zhu, Chaoyang
    Huang, Kejie
    Yang, Shuyuan
    Zhu, Ziqi
    Zhang, Hejia
    Shen, Haibin
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (09) : 1953 - 1965