SPARCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks

Cited by: 20
Authors
Sen, Sanchari [1 ]
Jain, Shubham [1 ]
Venkataramani, Swagath [2 ]
Raghunathan, Anand [1 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47906 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Funding
U.S. National Science Foundation;
Keywords
Deep learning; deep neural networks; sparsity; general purpose processors;
DOI
10.1109/TC.2018.2879434
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. The enormous computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed through the design of DNN accelerators. However, these specialized accelerators utilize large quantities of multiply-accumulate units and on-chip memory, and are prohibitive in area- and cost-constrained systems such as wearable devices and IoT sensors. In this work, we take a complementary approach and improve the performance of DNNs on general-purpose processor (GPP) cores. We do so by exploiting a key attribute of DNNs, viz., sparsity, or the prevalence of zero values. We propose Sparsity-aware Core Extensions (SPARCE), a set of low-overhead micro-architectural and ISA extensions that dynamically detect whether an operand (e.g., the result of a load instruction) is zero and subsequently skip a set of future instructions that use it. To maximize performance benefits, SPARCE ensures that the instructions to be skipped are prevented from even being fetched, as squashing instructions comes with a penalty (e.g., a pipeline stall). SPARCE consists of two key micro-architectural enhancements. First, a Sparsity Register File (SpRF) is utilized to track registers that are zero. Next, a Sparsity-Aware Skip Address (SASA) Table is used to indicate instruction sequences that can be skipped, and to specify conditions on SpRF registers that trigger instruction skipping. When an instruction is fetched, SPARCE dynamically pre-identifies whether the following instruction(s) can be skipped, and if so appropriately modifies the program counter, thereby skipping the redundant instructions and improving performance. We model SPARCE using the gem5 architectural simulator, and evaluate our approach on 6 state-of-the-art image-recognition DNNs in the context of both training and inference using the Caffe deep learning framework. On a scalar microprocessor, SPARCE achieves 1.11x-1.96x speedups across both convolution and fully-connected layers that exhibit 10-90 percent sparsity. These speedups translate to a 19-31 percent reduction in execution time at the overall application level. We also evaluate SPARCE on a 4-way SIMD ARMv8 processor using the OpenBLAS library, and demonstrate that SPARCE achieves an 8-15 percent reduction in application-level execution time.
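To make the skip mechanism described in the abstract concrete, the following is a minimal, purely illustrative sketch in Python (not taken from the paper): a Sparsity Register File (SpRF) records which registers currently hold zero, and a SASA table entry, looked up by the current program counter at fetch time, names the registers to test and how many following instructions to skip. All class names, fields, and the "mode" parameter are assumptions made for illustration; the paper defines the actual SASA entry format and ISA interface.

# Illustrative sketch of SPARCE's fetch-stage skip decision (assumed structures).
class SpRF:
    """One flag per architectural register: True if the register holds zero."""
    def __init__(self, num_regs=32):
        self.zero = [False] * num_regs

    def update(self, reg, value):
        # Updated on writeback, e.g., when a load returns a zero operand.
        self.zero[reg] = (value == 0)

class SASATable:
    """Maps a trigger PC to (registers to test, number of instructions to skip, condition)."""
    def __init__(self):
        self.entries = {}

    def add(self, pc, regs, skip_count, mode="any"):
        self.entries[pc] = (regs, skip_count, mode)

def next_pc(pc, sprf, sasa, inst_size=4):
    """Return the PC to fetch next, jumping ahead if the SASA condition holds."""
    entry = sasa.entries.get(pc)
    if entry:
        regs, skip_count, mode = entry
        test = all if mode == "all" else any
        if test(sprf.zero[r] for r in regs):
            # The redundant instructions are skipped entirely (never fetched).
            return pc + (1 + skip_count) * inst_size
    return pc + inst_size

# Example: a SASA entry at PC 0x100 says "if r5 is zero, skip the next 2 instructions"
# (e.g., a multiply-accumulate pair that would contribute nothing).
sprf = SpRF()
sasa = SASATable()
sasa.add(pc=0x100, regs=[5], skip_count=2)
sprf.update(5, 0)                         # r5 was set to zero earlier (a sparse operand)
print(hex(next_pc(0x100, sprf, sasa)))    # 0x10c: the redundant sequence is skipped
print(hex(next_pc(0x200, sprf, sasa)))    # 0x204: no SASA entry, normal sequential fetch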
Pages: 912-925
Number of pages: 14
Related Papers
50 records in total
  • [1] Sparsity-Aware Caches to Accelerate Deep Neural Networks. Ganesan, Vinod; Sen, Sanchari; Kumar, Pratyush; Gala, Neel; Veezhinathan, Kamakoti; Raghunathan, Anand. Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE 2020), 2020: 85-90.
  • [2] A General-Purpose Neural Architecture Search Algorithm for Building Deep Neural Networks. Zito, Francesco; Cutello, Vincenzo; Pavone, Mario. Metaheuristics, MIC 2024, Part II, 2024, 14754: 126-141.
  • [3] A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud. Zeng, Shulin; Dai, Guohao; Sun, Hanbo; Liu, Jun; Li, Shiyao; Ge, Guangjun; Zhong, Kai; Guo, Kaiyuan; Wang, Yu; Yang, Huazhong. ACM Transactions on Reconfigurable Technology and Systems, 2022, 15(3).
  • [4] Multimedia extensions for general-purpose processors. Lee, RB. SIPS 97 - 1997 IEEE Workshop on Signal Processing Systems: Design and Implementation, 1997: 9-23.
  • [5] POSTER: Exploiting the Input Sparsity to Accelerate Deep Neural Networks. Dong, Xiao; Liu, Lei; Li, Guangli; Li, Jiansong; Zhao, Peng; Wang, Xueying; Feng, Xiaobing. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (PPoPP '19), 2019: 401-402.
  • [6] Design of a General-Purpose MIMO Predictor with Neural Networks. Cui, XZ; Shin, KG. Journal of Intelligent Material Systems and Structures, 1994, 5(2): 198-210.
  • [7] Data-driven simulation for general-purpose multibody dynamics using Deep Neural Networks. Choi, Hee-Sun; An, Junmo; Han, Seongji; Kim, Jin-Gyun; Jung, Jae-Yoon; Choi, Juhwan; Orzechowski, Grzegorz; Mikkola, Aki; Choi, Jin Hwan. Multibody System Dynamics, 2021, 51(4): 419-454.
  • [8] Sparsity-aware generalization theory for deep neural networks. Muthukumar, Ramchandran; Sulam, Jeremias. Thirty Sixth Annual Conference on Learning Theory, Vol. 195, 2023.
  • [9] Sparsity-Aware Orthogonal Initialization of Deep Neural Networks. Esguerra, Kiara; Nasir, Muneeb; Tang, Tong Boon; Tumian, Afidalina; Ho, Eric Tatt Wei. IEEE Access, 2023, 11: 74165-74181.