Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Cited by: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization;
DOI
10.1109/MLHPC.2018.000-4
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and the efficiency of its parallelization plays a key role in finishing the training in a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from the high cost of inter-process communication and synchronization. To address these problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can result in a poor convergence rate due to gradient staleness, and the master-slave model does not scale to a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD that adopts a few design strategies to maximize the degree of overlap between computation and communication. A time-complexity analysis shows that our algorithm outperforms the traditional allreduce-based algorithm. Training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset, our experiments on Cori Phase-I, a Cray XC40 supercomputer at NERSC, show that our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
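
To illustrate the idea of overlapping gradient communication with backpropagation described in the abstract, the following minimal sketch (an illustration of the general technique, not the authors' exact algorithm) starts a non-blocking MPI allreduce for each layer's gradient as soon as it is produced, using mpi4py. The layer sizes are hypothetical placeholders, and the per-layer backward pass is faked with random arrays.

    # Minimal sketch: layer-wise gradient averaging for synchronous SGD
    # with computation/communication overlap via non-blocking allreduce.
    # NOTE: layer sizes and the random "gradients" are stand-ins; in real
    # training the gradient would come from each layer's backward pass.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    nprocs = comm.Get_size()

    layer_sizes = [4096, 2048, 512]   # hypothetical per-layer parameter counts
    requests, grads = [], []

    for size in reversed(layer_sizes):    # backprop proceeds from last layer to first
        grad = np.random.rand(size)       # stand-in for this layer's local gradient
        # Start the sum-reduction now; later layers' gradients are computed
        # while this message is still in flight.
        req = comm.Iallreduce(MPI.IN_PLACE, grad, op=MPI.SUM)
        requests.append(req)
        grads.append(grad)

    MPI.Request.Waitall(requests)            # finish all outstanding reductions
    avg_grads = [g / nprocs for g in grads]  # averaged gradients, ready for the SGD update

Run with, e.g., `mpiexec -n 4 python overlap_sketch.py`; the blocking alternative would wait for every layer's gradient before a single large allreduce, forfeiting the overlap.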
Pages: 47 - 56
Page count: 10