Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Times Cited: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization;
DOI
10.1109/MLHPC.2018.000-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and the efficiency of its parallelization plays a key role in finishing the training in a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from high costs of inter-process communication and synchronization. To address such problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can result in a poor convergence rate due to gradient staleness. In addition, the master-slave model is not scalable when running on a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD, which adopts a few design strategies to maximize the degree of overlap between computation and communication. The time complexity analysis shows our algorithm outperforms the traditional allreduce-based algorithm. In experiments training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset using Cori Phase-I, a Cray XC40 supercomputer at NERSC, our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
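The central idea described in the abstract is to hide the gradient-averaging communication of synchronous SGD behind the backward computation. The sketch below illustrates that general pattern with layer-wise non-blocking allreduce in mpi4py; it is a minimal illustration under assumed interfaces (the Layer class, train_step, and all shapes and hyperparameters are hypothetical), not the authors' actual algorithm from the paper. Only the mpi4py and NumPy calls shown are real APIs.

```python
# Minimal sketch of overlapping gradient averaging with back-propagation in
# synchronous SGD. NOT the paper's algorithm: the Layer class, train_step, and
# all sizes are hypothetical placeholders chosen for illustration.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
np.random.seed(0)  # same initial weights on every rank (stand-in for a broadcast)

class Layer:
    """Hypothetical fully connected layer used only for illustration."""
    def __init__(self, n_in, n_out):
        self.w = 0.01 * np.random.randn(n_in, n_out)
    def forward(self, x):
        self.x = x
        return x @ self.w
    def backward(self, grad_out):
        # Return (gradient w.r.t. input, gradient w.r.t. weights).
        return grad_out @ self.w.T, self.x.T @ grad_out

def train_step(layers, x, grad_loss, lr=0.01):
    """One synchronous-SGD step that overlaps communication with computation."""
    for layer in layers:
        x = layer.forward(x)
    requests, avg_grads = [], []
    g = grad_loss
    # Back-propagate from the last layer to the first; the allreduce for a layer
    # proceeds in the background while earlier layers' gradients are computed.
    for layer in reversed(layers):
        g, wgrad = layer.backward(g)
        buf = np.empty_like(wgrad)
        requests.append(comm.Iallreduce(wgrad, buf, op=MPI.SUM))  # non-blocking
        avg_grads.append(buf)
    MPI.Request.Waitall(requests)          # single synchronization point
    for layer, buf in zip(reversed(layers), avg_grads):
        layer.w -= lr * buf / nprocs       # apply the globally averaged gradient

if __name__ == "__main__":
    layers = [Layer(8, 16), Layer(16, 4)]
    x = np.random.randn(32, 8)
    grad_loss = np.random.randn(32, 4)     # placeholder loss gradient
    train_step(layers, x, grad_loss)
```

Run with, e.g., `mpiexec -n 4 python overlap_sgd.py`; the single Waitall replaces the per-iteration blocking allreduce of the traditional approach.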
Pages: 47-56
Number of pages: 10
Related Papers
50 records in total
  • [21] Efficient Learning Rate Adaptation for Convolutional Neural Network Training. Georgakopoulos, Spiros V.; Plagianakos, Vassilis P. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019.
  • [22] FedAGL: A Communication-Efficient Federated Vehicular Network. Liu, Su; Li, Yushuai; Guan, Peiyuan; Li, Tianyi; Yu, Jiong; Taherkordi, Amir; Jensen, Christian S. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (02): 3704-3720.
  • [23] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices. Choi, Seungkyu; Sim, Jaehyeong; Kang, Myeonggu; Choi, Yeongjae; Kim, Hyeonuk; Kim, Lee-Sup. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10): 2691-2702.
  • [24] Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space. Abrishami, Mohammad Saeed; Eshratifar, Amir Erfan; Eigen, David; Wang, Yanzhi; Nazarian, Shahin; Pedram, Massoud. PROCEEDINGS OF THE TWENTYFIRST INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2020), 2020: 347-351.
  • [25] Communication-Efficient Federated Learning With Binary Neural Networks. Yang, Yuzhi; Zhang, Zhaoyang; Yang, Qianqian. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2021, 39 (12): 3836-3850.
  • [26] Deep Convolutional Neural Network. Zhou, Yu; Fang, Rui; Liu, Peng; Liu, Kai. 2019 PROCEEDINGS OF THE CONFERENCE ON CONTROL AND ITS APPLICATIONS, CT, 2019: 46-51.
  • [27] Communication-Efficient Stochastic Gradient MCMC for Neural Networks. Li, Chunyuan; Chen, Changyou; Pu, Yunchen; Henao, Ricardo; Carin, Lawrence. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 4173-4180.
  • [28] Robust feature space separation for deep convolutional neural network training. Sekmen, A.; Parlaktuna, M.; Abdul-Malek, A.; Erdemir, E.; Koku, A. B. Discover Artificial Intelligence, 2021, 1 (01).
  • [29] FPGA Based Reconfigurable Coprocessor for Deep Convolutional Neural Network Training. Clere, Sajna Remi; Sachin, S.; Varghese, Kuruvilla. 2018 21ST EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2018), 2018: 381-388.
  • [30] Intelligent Beam Training with Deep Convolutional Neural Network in mmWave Communications. Wang, Zicun; Shan, Wenxing; Zhang, Lin; Ma, Song; Xiao, Ming; Wei, Ning; Li, Shaoqian. 2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022: 1223-1228.