Accelerating neural network training with distributed asynchronous and selective optimization (DASO)

Cited by: 4
Authors
Coquelin, Daniel [1]
Debus, Charlotte [1]
Goetz, Markus [1]
von der Lehr, Fabrice [2]
Kahn, James [1]
Siggel, Martin [2]
Streit, Achim [1]
Affiliations
[1] Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, D-76344 Eggenstein-Leopoldshafen, Germany
[2] German Aerospace Center (DLR), D-51147 Cologne, Germany
Keywords
Machine learning; Neural networks; Data parallel training; Multi-node; Multi-GPU; Stale gradients
DOI
10.1186/s40537-021-00556-1
CLC number (Chinese Library Classification)
TP301 [Theory and Methods]
Subject classification code
081202
Abstract
With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) and large-scale distributed resources on computer clusters. Current DPNN approaches implement the network parameter updates by synchronizing and averaging gradients across all processes with blocking communication operations after each forward-backward pass. This synchronization is the central algorithmic bottleneck. We introduce the distributed asynchronous and selective optimization (DASO) method, which leverages multi-GPU compute node architectures to accelerate network training while maintaining accuracy. DASO uses a hierarchical and asynchronous communication scheme composed of node-local and global networks, and it adjusts the global synchronization rate during the learning process. We show that DASO reduces training time by up to 34% on classical and state-of-the-art networks compared to current optimized data parallel training methods.
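The hierarchical scheme described in the abstract can be pictured concretely. Below is a minimal PyTorch sketch, assuming torch.distributed has already been initialized with one process per GPU; the group construction, the leader-rank convention, and the fixed global_every interval are illustrative assumptions, not the authors' implementation. Note that DASO additionally overlaps the asynchronous inter-node all-reduce with subsequent local steps and applies the resulting stale gradients later, whereas this sketch waits on it immediately for brevity.

    import torch
    import torch.distributed as dist

    def build_groups(gpus_per_node: int):
        """One process group per node, plus a 'leader' group spanning the
        first rank of every node. Every rank must call new_group() for
        every group, including groups it does not belong to."""
        world, rank = dist.get_world_size(), dist.get_rank()
        local_group = None
        for n in range(world // gpus_per_node):
            g = dist.new_group(list(range(n * gpus_per_node, (n + 1) * gpus_per_node)))
            if n == rank // gpus_per_node:
                local_group = g
        global_group = dist.new_group(list(range(0, world, gpus_per_node)))
        return local_group, global_group

    def hierarchical_sync(model, step, local_group, global_group,
                          gpus_per_node, n_nodes, global_every=4):
        """Average gradients node-locally on every step; average across
        nodes (via the leader ranks) only every `global_every` steps."""
        rank = dist.get_rank()
        leader = (rank // gpus_per_node) * gpus_per_node  # first rank on this node
        grads = [p.grad for p in model.parameters() if p.grad is not None]

        # 1) Fast intra-node all-reduce (NVLink/PCIe) on every step.
        for g in grads:
            dist.all_reduce(g, group=local_group)
            g.div_(gpus_per_node)

        # 2) Occasional inter-node all-reduce among node leaders only;
        #    async_op=True returns handles, so the slow network transfer
        #    can overlap other work before we wait on it.
        if step % global_every == 0:
            if rank == leader:
                handles = [dist.all_reduce(g, group=global_group, async_op=True)
                           for g in grads]
                for h in handles:
                    h.wait()
                for g in grads:
                    g.div_(n_nodes)
            # 3) Leaders push the globally averaged gradients back to the
            #    other ranks on their node.
            for g in grads:
                dist.broadcast(g, src=leader, group=local_group)

In a training loop, such a routine would be called between loss.backward() and optimizer.step(); the selective aspect of DASO then corresponds to varying the global synchronization rate (here the fixed global_every) as training progresses.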
Pages: 18
Related papers
50 items in total (items [21]-[30] shown)
  • [21] A Distributed Particle Swarm Optimization Algorithm Based on Apache Spark for Asynchronous Parallel Training of Deep Neural Networks
Capel, Manuel I.
    Holgado-Terriza, Juan A.
    Galiana-Velasco, Sergio
    Salguero, Alberto G.
    53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024: 76-85
  • [22] Global optimization for neural network training
    Shang, Y.
    Wah, B. W.
    COMPUTER, 1996, 29(3): 45+
  • [23] Distributed Graph Neural Network Training: A Survey
    Shao, Yingxia
    Li, Hongzheng
    Gu, Xizhi
    Yin, Hongbo
    Li, Yawen
    Miao, Xupeng
    Zhang, Wentao
    Cui, Bin
    Chen, Lei
    ACM COMPUTING SURVEYS, 2024, 56(8)
  • [24] NeuralGenesis: a software for distributed neural network training
    Tsoulos, Ioannis
    Tzallas, Alexandros T.
    Tsalikakis, Dimitrios G.
    Giannakeas, Nikolaos
    Tsipouras, Markos G.
    Androulidakis, Iosif
    Zaitseva, Elena
    2016 24TH TELECOMMUNICATIONS FORUM (TELFOR), 2016: 841-844
  • [25] GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
    Cai, Tianle
    Luo, Shengjie
    Xu, Keyulu
    He, Di
    Liu, Tie-Yan
    Wang, Liwei
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML), VOL 139, 2021
  • [26] fuseGNN: Accelerating Graph Convolutional Neural Network Training on GPGPU
    Chen, Zhaodong
    Yan, Mingyu
    Zhu, Maohua
    Deng, Lei
    Li, Guoqi
    Li, Shuangchen
    Xie, Yuan
    2020 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2020
  • [27] Accelerating Neural Network Training with Processing-in-Memory GPU
    Fei, Xiang
    Han, Jianhui
    Huang, Jianqiang
    Zheng, Weimin
    Zhang, Youhui
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022: 414-421
  • [28] An Artificial Neural Network for Distributed Constrained Optimization
    Liu, Na
    Jia, Wenwen
    Qin, Sitian
    Li, Guocheng
    NEURAL INFORMATION PROCESSING (ICONIP 2018), PT II, 2018, 11302: 430-441
  • [29] Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees
    Kungurtsev, Vyacheslav
    Egan, Malcolm
    Chatterjee, Bapi
    Alistarh, Dan
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 8209-8216
  • [30] Accelerating deep neural network training for action recognition on a cluster of GPUs
    Cong, Guojing
    Domeniconi, Giacomo
    Shapiro, Joshua
    Zhou, Fan
    Chen, Barry
    2018 30TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2018), 2018: 298-305