Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Times Cited: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization;
DOI
10.1109/MLHPC.2018.000-4
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and the efficiency of parallelization plays a key role in finishing the training within a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from high inter-process communication and synchronization costs. To address these problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can result in a poor convergence rate due to gradient staleness, and the master-slave model does not scale to a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD that adopts a few design strategies to maximize the degree of overlap between computation and communication. A time-complexity analysis shows that our algorithm outperforms the traditional allreduce-based algorithm. Training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset, our experiments on Cori Phase-I, a Cray XC40 supercomputer at NERSC, show that our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
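The key idea the abstract describes, hiding gradient-averaging communication behind gradient computation, can be illustrated with layer-wise non-blocking allreduce calls issued during back-propagation. The snippet below is only a minimal sketch of that general pattern, not the paper's algorithm; it assumes mpi4py on top of an MPI-3 library (required for Iallreduce) and NumPy, and the layer shapes and the backward() stub are invented for illustration.

# Sketch: overlapping layer-wise gradient averaging with back-propagation.
# Illustrative only; NOT the paper's exact algorithm. Assumes mpi4py + MPI-3.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()

# Hypothetical per-layer gradient shapes.
layer_shapes = [(512, 512), (512, 256), (256, 10)]

def backward(layer_idx):
    """Stand-in for computing one layer's local gradient on this rank."""
    return np.random.rand(*layer_shapes[layer_idx])

requests = []
averaged = [None] * len(layer_shapes)
send_bufs = []                                  # keep send buffers alive until Wait()
for i in reversed(range(len(layer_shapes))):    # back-propagation order: last layer first
    grad = backward(i)                          # compute this layer's local gradient
    recv = np.empty_like(grad)
    send_bufs.append(grad)
    # Start a non-blocking allreduce; the communication can progress while the
    # gradients of the earlier layers are still being computed.
    requests.append((i, comm.Iallreduce(grad, recv, op=MPI.SUM)))
    averaged[i] = recv

for i, req in requests:
    req.Wait()                                  # complete the outstanding communication
    averaged[i] /= nprocs                       # turn the global sum into an average

if comm.Get_rank() == 0:
    print("gradient averaging finished; ready for the SGD weight update")

Run with, e.g., "mpirun -np 4 python sketch.py"; in a real training loop the overlap comes from issuing each Iallreduce as soon as a layer's gradient is available instead of waiting for a single allreduce over all parameters.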
Pages: 47-56
Page count: 10
Related Papers (50 records in total)
  • [41] Efficient training for the hybrid optical diffractive deep neural network
    Fang, Tao
    Lia, Jingwei
    Wu, Tongyu
    Cheng, Ming
    Dong, Xiaowen
    AI AND OPTICAL DATA SCIENCES III, 2022, 12019
  • [42] Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
    Mostafa, Hesham
    Wang, Xin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [43] DGS: Communication-Efficient Graph Sampling for Distributed GNN Training
    Wan, Xinchen
    Chen, Kai
    Zhang, Yiming
    2022 IEEE 30TH INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS (ICNP 2022), 2022
  • [44] Communication-Efficient Learning of Deep Networks from Decentralized Data
    McMahan, H. Brendan
    Moore, Eider
    Ramage, Daniel
    Hampson, Seth
    Aguera y Arcas, Blaise
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 1273 - 1282
  • [45] Communication-Efficient Federated DNN Training: Convert, Compress, Correct
    Chen, Zhong-Jing
    Hernandez, Eduin E.
    Huang, Yu-Chih
    Rini, Stefano
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (24) : 40431 - 40447
  • [46] Communication-Efficient Distributed Deep Metric Learning with Hybrid Synchronization
    Su, Yuxin
    Lyu, Michael
    King, Irwin
    CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2018, : 1463 - 1472
  • [47] CGX: Adaptive System Support for Communication-Efficient Deep Learning
    Markov, Ilia
    Ramezanikebrya, Hamidreza
    Alistarh, Dan
    PROCEEDINGS OF THE TWENTY-THIRD ACM/IFIP INTERNATIONAL MIDDLEWARE CONFERENCE, MIDDLEWARE 2022, 2022, : 241 - 254
  • [48] Automatic Delineation Strategy for Brain Metastases Using Deep Convolutional Neural Network
    Liu, Y.
    Stojadinovic, S.
    Hrycushko, B.
    Wardak, Z.
    Lu, W.
    Yan, Y.
    Jiang, S.
    Zhen, X.
    Timmerman, R.
    Abdulrahman, R.
    Nedzi, L.
    Gu, X.
    MEDICAL PHYSICS, 2017, 44 (06) : 3009 - 3010
  • [49] Deep Convolutional Neural Network Compression based on the Intrinsic Dimension of the Training Data
    Hadi, Abir Mohammad
    Won, Kwanghee
    APPLIED COMPUTING REVIEW, 2024, 24 (01): : 14 - 23
  • [50] FxpNet: Training a Deep Convolutional Neural Network in Fixed-Point Representation
    Chen, Xi
    Hu, Xiaolin
    Zhou, Hucheng
    Xu, Ningyi
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2494 - 2501