Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Times Cited: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization;
DOI
10.1109/MLHPC.2018.000-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and the efficiency of its parallelization plays a key role in finishing the training in a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from high costs of inter-process communication and synchronization. To address such problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can result in a poor convergence rate due to gradient staleness. In addition, the master-slave model is not scalable when running on a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD, which adopts a few design strategies to maximize the degree of overlap between computation and communication. The time complexity analysis shows our algorithm outperforms the traditional allreduce-based algorithm. In experiments training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset using Cori Phase-I, a Cray XC40 supercomputer at NERSC, our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
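The central idea described in the abstract is to hide the gradient-averaging communication of synchronous SGD behind the backward computation. The sketch below illustrates that general pattern with layer-wise non-blocking allreduce in mpi4py; it is a minimal illustration under assumed interfaces (the Layer class, train_step, and all shapes and hyperparameters are hypothetical), not the authors' actual algorithm from the paper. Only the mpi4py and NumPy calls shown are real APIs.

```python
# Minimal sketch of overlapping gradient averaging with back-propagation in
# synchronous SGD. NOT the paper's algorithm: the Layer class, train_step, and
# all sizes are hypothetical placeholders chosen for illustration.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
np.random.seed(0)  # same initial weights on every rank (stand-in for a broadcast)

class Layer:
    """Hypothetical fully connected layer used only for illustration."""
    def __init__(self, n_in, n_out):
        self.w = 0.01 * np.random.randn(n_in, n_out)
    def forward(self, x):
        self.x = x
        return x @ self.w
    def backward(self, grad_out):
        # Return (gradient w.r.t. input, gradient w.r.t. weights).
        return grad_out @ self.w.T, self.x.T @ grad_out

def train_step(layers, x, grad_loss, lr=0.01):
    """One synchronous-SGD step that overlaps communication with computation."""
    for layer in layers:
        x = layer.forward(x)
    requests, avg_grads = [], []
    g = grad_loss
    # Back-propagate from the last layer to the first; the allreduce for a layer
    # proceeds in the background while earlier layers' gradients are computed.
    for layer in reversed(layers):
        g, wgrad = layer.backward(g)
        buf = np.empty_like(wgrad)
        requests.append(comm.Iallreduce(wgrad, buf, op=MPI.SUM))  # non-blocking
        avg_grads.append(buf)
    MPI.Request.Waitall(requests)          # single synchronization point
    for layer, buf in zip(reversed(layers), avg_grads):
        layer.w -= lr * buf / nprocs       # apply the globally averaged gradient

if __name__ == "__main__":
    layers = [Layer(8, 16), Layer(16, 4)]
    x = np.random.randn(32, 8)
    grad_loss = np.random.randn(32, 4)     # placeholder loss gradient
    train_step(layers, x, grad_loss)
```

Run with, e.g., `mpiexec -n 4 python overlap_sgd.py`; the single Waitall replaces the per-iteration blocking allreduce of the traditional approach.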
Pages: 47-56
Number of pages: 10
Related Papers
50 records in total
  • [21] Efficient Learning Rate Adaptation for Convolutional Neural Network Training. Georgakopoulos, Spiros V.; Plagianakos, Vassilis P. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019.
  • [22] FedAGL: A Communication-Efficient Federated Vehicular Network. Liu, Su; Li, Yushuai; Guan, Peiyuan; Li, Tianyi; Yu, Jiong; Taherkordi, Amir; Jensen, Christian S. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (02): 3704-3720.
  • [23] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices. Choi, Seungkyu; Sim, Jaehyeong; Kang, Myeonggu; Choi, Yeongjae; Kim, Hyeonuk; Kim, Lee-Sup. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10): 2691-2702.
  • [24] Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space. Abrishami, Mohammad Saeed; Eshratifar, Amir Erfan; Eigen, David; Wang, Yanzhi; Nazarian, Shahin; Pedram, Massoud. PROCEEDINGS OF THE TWENTYFIRST INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2020), 2020: 347-351.
  • [25] Communication-Efficient Federated Learning With Binary Neural Networks. Yang, Yuzhi; Zhang, Zhaoyang; Yang, Qianqian. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2021, 39 (12): 3836-3850.
  • [26] Deep Convolutional Neural Network. Zhou, Yu; Fang, Rui; Liu, Peng; Liu, Kai. 2019 PROCEEDINGS OF THE CONFERENCE ON CONTROL AND ITS APPLICATIONS, CT, 2019: 46-51.
  • [27] Communication-Efficient Stochastic Gradient MCMC for Neural Networks. Li, Chunyuan; Chen, Changyou; Pu, Yunchen; Henao, Ricardo; Carin, Lawrence. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 4173-4180.
  • [28] Robust feature space separation for deep convolutional neural network training. Sekmen, A.; Parlaktuna, M.; Abdul-Malek, A.; Erdemir, E.; Koku, A. B. Discover Artificial Intelligence, 2021, 1 (01).
  • [29] FPGA Based Reconfigurable Coprocessor for Deep Convolutional Neural Network Training. Clere, Sajna Remi; Sachin, S.; Varghese, Kuruvilla. 2018 21ST EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2018), 2018: 381-388.
  • [30] Intelligent Beam Training with Deep Convolutional Neural Network in mmWave Communications. Wang, Zicun; Shan, Wenxing; Zhang, Lin; Ma, Song; Xiao, Ming; Wei, Ning; Li, Shaoqian. 2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022: 1223-1228.