Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

Cited by: 107
Authors
Rhu, Minsoo [1]
O'Connor, Mike [2]
Chatterjee, Niladrish [2]
Pool, Jeff [2]
Kwon, Youngeun [1]
Keckler, Stephen W. [2]
Affiliations
[1] POSTECH, Pohang, South Korea
[2] NVIDIA, Santa Clara, CA, USA
DOI
10.1109/HPCA.2018.00017
CLC number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be utilized for memory allocations. Despite its merits, virtualizing memory can incur significant performance overheads when the time needed to copy data back and forth from CPU memory is higher than the latency to perform DNN computations. We introduce a high-performance virtualization strategy based on a "compressing DMA engine" (cDMA) that drastically reduces the size of the data structures that are targeted for CPU-side allocations. The cDMA engine offers an average 2.6x (maximum 13.8x) compression ratio by exploiting the sparsity inherent in offloaded data, improving the performance of virtualized DNNs by an average 53% (maximum 79%) when evaluated on an NVIDIA Titan Xp.
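The compression the abstract alludes to exploits the large fraction of zero-valued activations that DNN layers (e.g., ReLU) produce. Below is a minimal sketch of zero-value compression, one common way to exploit that sparsity: a per-element bitmask plus the densely packed nonzero values. The function names, block layout, and sparsity level are assumptions made for this illustration, not the paper's hardware design.

```python
# Minimal sketch of zero-value compression (ZVC) for a sparse activation buffer.
# Illustrative only: function names and parameters are assumptions, not the
# cDMA engine's actual implementation.
import numpy as np

def zvc_compress(block: np.ndarray):
    """Split a flat block into a presence bitmask plus its packed nonzero values."""
    flat = block.ravel()
    mask = flat != 0.0                      # 1 bit per element in a hardware scheme
    nonzeros = flat[mask]                   # densely packed nonzero payload
    return np.packbits(mask), nonzeros

def zvc_decompress(mask_bits: np.ndarray, nonzeros: np.ndarray, n: int):
    """Rebuild the original dense block from the bitmask and packed nonzeros."""
    mask = np.unpackbits(mask_bits, count=n).astype(bool)
    out = np.zeros(n, dtype=nonzeros.dtype)
    out[mask] = nonzeros
    return out

# Example: a ReLU-style activation buffer that is roughly 70% zeros.
rng = np.random.default_rng(0)
acts = np.maximum(rng.normal(-0.5, 1.0, size=1 << 20), 0).astype(np.float32)

mask_bits, nz = zvc_compress(acts)
restored = zvc_decompress(mask_bits, nz, acts.size)
assert np.array_equal(acts, restored)       # lossless round trip

uncompressed = acts.nbytes
compressed = mask_bits.nbytes + nz.nbytes   # bitmask + nonzero payload
print(f"compression ratio: {uncompressed / compressed:.2f}x")
```

With about 70% of the activations zeroed, this sketch reports a ratio near 3x, in the same ballpark as the 2.6x average the abstract cites; the achievable ratio depends directly on how sparse the offloaded data is.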
Pages: 78-91
Page count: 14