Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

Cited by: 107
Authors
Rhu, Minsoo [1 ]
O'Connor, Mike [2 ]
Chatterjee, Niladrish [2 ]
Pool, Jeff [2 ]
Kwon, Youngeun [1 ]
Keckler, Stephen W. [2 ]
Affiliations
[1] POSTECH, Pohang, South Korea
[2] NVIDIA, Santa Clara, CA, USA
DOI: 10.1109/HPCA.2018.00017
Chinese Library Classification: TP3 [Computing Technology, Computer Technology]
Subject Classification Code: 0812
Abstract
Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be utilized for memory allocations. Despite its merits, virtualizing memory can incur significant performance overheads when the time needed to copy data back and forth from CPU memory is higher than the latency to perform DNN computations. We introduce a high-performance virtualization strategy based on a "compressing DMA engine" (cDMA) that drastically reduces the size of the data structures that are targeted for CPU-side allocations. The cDMA engine offers an average 2.6x (maximum 13.8x) compression ratio by exploiting the sparsity inherent in offloaded data, improving the performance of virtualized DNNs by an average 53% (maximum 79%) when evaluated on an NVIDIA Titan Xp.
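The abstract's central idea is that offloaded activation maps (e.g., ReLU outputs) are mostly zeros, so compressing them before they are DMA-transferred to CPU memory shrinks the traffic that would otherwise stall training. A minimal software sketch of one simple sparsity-exploiting scheme, bitmask-based zero-value compression, is given below; it only illustrates the general idea under that assumption, not the paper's actual hardware cDMA design, and the function names (zvc_compress, zvc_decompress) are hypothetical.

    import numpy as np

    def zvc_compress(activations: np.ndarray):
        """Zero-value compression: keep only nonzero elements plus a bitmask.

        Hypothetical software sketch of the kind of sparsity-exploiting
        compression a compressing DMA engine could apply before offloading
        activation maps to CPU memory; not the paper's hardware design.
        """
        flat = activations.ravel()
        mask = flat != 0                      # would cost ~1 bit per element in hardware
        nonzeros = flat[mask]
        return np.packbits(mask), nonzeros, activations.shape

    def zvc_decompress(packed_mask, nonzeros, shape):
        """Rebuild the dense activation tensor from the bitmask and nonzero values."""
        n = int(np.prod(shape))
        mask = np.unpackbits(packed_mask)[:n].astype(bool)
        flat = np.zeros(n, dtype=nonzeros.dtype)
        flat[mask] = nonzeros
        return flat.reshape(shape)

    # Example: a ReLU-like output with roughly 70% zeros compresses to the
    # bitmask plus only the surviving nonzero values.
    act = np.maximum(np.random.randn(64, 128, 7, 7).astype(np.float32) - 0.5, 0)
    packed_mask, nz, shape = zvc_compress(act)
    ratio = act.nbytes / (packed_mask.nbytes + nz.nbytes)
    print(f"compression ratio: {ratio:.2f}x")

The achievable ratio in this sketch depends directly on how sparse the offloaded activations are, which is consistent with the abstract's observation that the average and maximum compression ratios differ widely across workloads.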
Pages: 78-91 (14 pages)
Related Papers (items 31-40 of 50)
  • [31] Progressive principle component analysis for compressing deep convolutional neural networks. Zhou, Jing; Qi, Haobo; Chen, Yu; Wang, Hansheng. NEUROCOMPUTING, 2021, 440: 197-206.
  • [32] Accelerating and Compressing Deep Neural Networks for Massive MIMO CSI Feedback. Erak, Omar; Abou-Zeid, Hatem. ICC 2023 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023: 1029-1035.
  • [33] Sparsity-Aware Orthogonal Initialization of Deep Neural Networks. Esguerra, Kiara; Nasir, Muneeb; Tang, Tong Boon; Tumian, Afidalina; Ho, Eric Tatt Wei. IEEE ACCESS, 2023, 11: 74165-74181.
  • [34] Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. Fan, Zhihua; Li, Wenming; Wang, Zhen; Liu, Tianyu; Wu, Haibin; Liu, Yanhuan; Wu, Meng; Wu, Xinxin; Ye, Xiaochun; Fan, Dongrui; Sun, Ninghui; An, Xuejun. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (12): 3253-3265.
  • [35] A Survey on Leveraging Deep Neural Networks for Object Tracking. Krebs, Sebastian; Duraisamy, Bharanidhar; Flohr, Fabian. 2017 IEEE 20TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2017.
  • [36] Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation. He, Huarui; Wang, Jie; Zhang, Zhanqiu; Wu, Feng. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022: 534-544.
  • [37] A Knee-Guided Evolutionary Algorithm for Compressing Deep Neural Networks. Zhou, Yao; Yen, Gary G.; Yi, Zhang. IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (03): 1626-1638.
  • [38] iDropout: Leveraging Deep Taylor Decomposition for the Robustness of Deep Neural Networks. Schreckenberger, Christian; Bartelt, Christian; Stuckenschmidt, Heiner. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2019 CONFERENCES, 2019, 11877: 113-126.
  • [39] Multilingual Training of Deep Neural Networks. Ghoshal, Arnab; Swietojanski, Pawel; Renals, Steve. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7319-7323.
  • [40] Training deep quantum neural networks. Beer, Kerstin; Bondarenko, Dmytro; Farrelly, Terry; Osborne, Tobias J.; Salzmann, Robert; Scheiermann, Daniel; Wolf, Ramona. NATURE COMMUNICATIONS, 2020, 11 (01).