Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Cited by: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization;
DOI
10.1109/MLHPC.2018.000-4
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and the efficiency of its parallelization plays a key role in finishing the training in a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from the high cost of inter-process communication and synchronization. To address these problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can result in a poor convergence rate due to gradient staleness, and the master-slave model does not scale to a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD that adopts a few design strategies to maximize the degree of overlap between computation and communication. A time-complexity analysis shows that our algorithm outperforms the traditional allreduce-based algorithm. Training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset, our experiments on Cori Phase-I, a Cray XC40 supercomputer at NERSC, show that our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
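
To illustrate the idea of overlapping gradient communication with backpropagation described in the abstract, the following minimal sketch (an illustration of the general technique, not the authors' exact algorithm) starts a non-blocking MPI allreduce for each layer's gradient as soon as it is produced, using mpi4py. The layer sizes are hypothetical placeholders, and the per-layer backward pass is faked with random arrays.

    # Minimal sketch: layer-wise gradient averaging for synchronous SGD
    # with computation/communication overlap via non-blocking allreduce.
    # NOTE: layer sizes and the random "gradients" are stand-ins; in real
    # training the gradient would come from each layer's backward pass.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    nprocs = comm.Get_size()

    layer_sizes = [4096, 2048, 512]   # hypothetical per-layer parameter counts
    requests, grads = [], []

    for size in reversed(layer_sizes):    # backprop proceeds from last layer to first
        grad = np.random.rand(size)       # stand-in for this layer's local gradient
        # Start the sum-reduction now; later layers' gradients are computed
        # while this message is still in flight.
        req = comm.Iallreduce(MPI.IN_PLACE, grad, op=MPI.SUM)
        requests.append(req)
        grads.append(grad)

    MPI.Request.Waitall(requests)            # finish all outstanding reductions
    avg_grads = [g / nprocs for g in grads]  # averaged gradients, ready for the SGD update

Run with, e.g., `mpiexec -n 4 python overlap_sketch.py`; the blocking alternative would wait for every layer's gradient before a single large allreduce, forfeiting the overlap.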
Pages: 47 - 56
Page count: 10