Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

Cited by: 107
Authors
Rhu, Minsoo [1 ]
O'Connor, Mike [2 ]
Chatterjee, Niladrish [2 ]
Pool, Jeff [2 ]
Kwon, Youngeun [1 ]
Keckler, Stephen W. [2 ]
Affiliations
[1] POSTECH, Pohang, South Korea
[2] NVIDIA, Santa Clara, CA, USA
DOI: 10.1109/HPCA.2018.00017
Chinese Library Classification: TP3 [Computing Technology, Computer Technology]
Subject Classification Code: 0812
Abstract
Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be utilized for memory allocations. Despite its merits, virtualizing memory can incur significant performance overheads when the time needed to copy data back and forth from CPU memory is higher than the latency to perform DNN computations. We introduce a high-performance virtualization strategy based on a "compressing DMA engine" (cDMA) that drastically reduces the size of the data structures that are targeted for CPU-side allocations. The cDMA engine offers an average 2.6x (maximum 13.8x) compression ratio by exploiting the sparsity inherent in offloaded data, improving the performance of virtualized DNNs by an average 53% (maximum 79%) when evaluated on an NVIDIA Titan Xp.
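The abstract's central idea is that offloaded activation maps (e.g., ReLU outputs) are mostly zeros, so compressing them before they are DMA-transferred to CPU memory shrinks the traffic that would otherwise stall training. A minimal software sketch of one simple sparsity-exploiting scheme, bitmask-based zero-value compression, is given below; it only illustrates the general idea under that assumption, not the paper's actual hardware cDMA design, and the function names (zvc_compress, zvc_decompress) are hypothetical.

    import numpy as np

    def zvc_compress(activations: np.ndarray):
        """Zero-value compression: keep only nonzero elements plus a bitmask.

        Hypothetical software sketch of the kind of sparsity-exploiting
        compression a compressing DMA engine could apply before offloading
        activation maps to CPU memory; not the paper's hardware design.
        """
        flat = activations.ravel()
        mask = flat != 0                      # would cost ~1 bit per element in hardware
        nonzeros = flat[mask]
        return np.packbits(mask), nonzeros, activations.shape

    def zvc_decompress(packed_mask, nonzeros, shape):
        """Rebuild the dense activation tensor from the bitmask and nonzero values."""
        n = int(np.prod(shape))
        mask = np.unpackbits(packed_mask)[:n].astype(bool)
        flat = np.zeros(n, dtype=nonzeros.dtype)
        flat[mask] = nonzeros
        return flat.reshape(shape)

    # Example: a ReLU-like output with roughly 70% zeros compresses to the
    # bitmask plus only the surviving nonzero values.
    act = np.maximum(np.random.randn(64, 128, 7, 7).astype(np.float32) - 0.5, 0)
    packed_mask, nz, shape = zvc_compress(act)
    ratio = act.nbytes / (packed_mask.nbytes + nz.nbytes)
    print(f"compression ratio: {ratio:.2f}x")

The achievable ratio in this sketch depends directly on how sparse the offloaded activations are, which is consistent with the abstract's observation that the average and maximum compression ratios differ widely across workloads.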
Pages: 78-91 (14 pages)
Related Papers (items 31-40 of 50)
  • [31] Progressive principle component analysis for compressing deep convolutional neural networks. Zhou, Jing; Qi, Haobo; Chen, Yu; Wang, Hansheng. NEUROCOMPUTING, 2021, 440: 197-206.
  • [32] Accelerating and Compressing Deep Neural Networks for Massive MIMO CSI Feedback. Erak, Omar; Abou-Zeid, Hatem. ICC 2023 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023: 1029-1035.
  • [33] Sparsity-Aware Orthogonal Initialization of Deep Neural Networks. Esguerra, Kiara; Nasir, Muneeb; Tang, Tong Boon; Tumian, Afidalina; Ho, Eric Tatt Wei. IEEE ACCESS, 2023, 11: 74165-74181.
  • [34] Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. Fan, Zhihua; Li, Wenming; Wang, Zhen; Liu, Tianyu; Wu, Haibin; Liu, Yanhuan; Wu, Meng; Wu, Xinxin; Ye, Xiaochun; Fan, Dongrui; Sun, Ninghui; An, Xuejun. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (12): 3253-3265.
  • [35] A Survey on Leveraging Deep Neural Networks for Object Tracking. Krebs, Sebastian; Duraisamy, Bharanidhar; Flohr, Fabian. 2017 IEEE 20TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2017.
  • [36] Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation. He, Huarui; Wang, Jie; Zhang, Zhanqiu; Wu, Feng. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022: 534-544.
  • [37] A Knee-Guided Evolutionary Algorithm for Compressing Deep Neural Networks. Zhou, Yao; Yen, Gary G.; Yi, Zhang. IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (03): 1626-1638.
  • [38] iDropout: Leveraging Deep Taylor Decomposition for the Robustness of Deep Neural Networks. Schreckenberger, Christian; Bartelt, Christian; Stuckenschmidt, Heiner. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2019 CONFERENCES, 2019, 11877: 113-126.
  • [39] Multilingual Training of Deep Neural Networks. Ghoshal, Arnab; Swietojanski, Pawel; Renals, Steve. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7319-7323.
  • [40] Training deep quantum neural networks. Beer, Kerstin; Bondarenko, Dmytro; Farrelly, Terry; Osborne, Tobias J.; Salzmann, Robert; Scheiermann, Daniel; Wolf, Ramona. NATURE COMMUNICATIONS, 2020, 11 (01).