Pipeline Parallelism With Elastic Averaging

Cited by: 0
Authors:
Jang, Bongwon [1]
Yoo, In-Chul [1]
Yook, Dongsuk [1]
Affiliations:
[1] Korea University, Artificial Intelligence Laboratory, Department of Computer Science and Engineering, Seoul 02841, Republic of Korea
Abstract:
To accelerate the training of massive DNN models on large-scale datasets, distributed training techniques, including data parallelism and model parallelism, have been extensively studied. In particular, pipeline parallelism, which is derived from model parallelism, has been attracting attention. It splits the model parameters across multiple computing nodes and executes multiple mini-batches simultaneously. However, naive pipeline parallelism suffers from weight inconsistency and delayed gradients, because the model parameters used in the forward and backward passes do not match, causing unstable training and low performance. In this study, we propose a novel pipeline parallelism technique called EA-Pipe to address the weight inconsistency and delayed gradient problems. EA-Pipe applies an elastic averaging method, which has been studied in the context of data parallelism, to pipeline parallelism. The proposed method maintains multiple model replicas to solve the weight inconsistency problem, and synchronizes the model replicas using an elasticity-based moving average method to mitigate the delayed gradient problem. To verify the efficacy of the proposed method, we conducted three image classification experiments on the CIFAR-10/100 and ImageNet datasets. The experimental results show that EA-Pipe not only accelerates training but also exhibits more stable learning behavior than existing pipeline parallelism techniques. In particular, in the experiments on the CIFAR-100 and ImageNet datasets, EA-Pipe recorded error rates that were 2.58% and 2.19% lower, respectively, than those of the baseline pipeline parallelization method.
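The abstract does not give EA-Pipe's exact update rule, but the elastic-averaging idea it borrows from data parallelism (EASGD-style) can be sketched roughly as follows: each model replica takes a local gradient step plus an elastic pull toward a shared center variable, while the center is nudged toward the replicas and so acts as a moving average of them. The toy loss, the learning rate `lr`, and the elasticity coefficient `rho` below are illustrative assumptions, not values from the paper.

```python
# Rough sketch (not the paper's exact algorithm) of an EASGD-style
# elastic averaging update, the synchronization idea EA-Pipe adapts
# from data parallelism to pipeline parallelism.

def grad(x):
    # Gradient of a toy quadratic loss f(x) = (x - 3)^2.
    return 2.0 * (x - 3.0)

def elastic_averaging_round(replicas, center, lr=0.05, rho=0.5):
    """One synchronization round over all model replicas."""
    new_replicas = []
    pull = 0.0
    for x in replicas:
        elastic = lr * rho * (x - center)          # elastic link term
        new_replicas.append(x - lr * grad(x) - elastic)
        pull += elastic                            # center absorbs the opposite pull
    return new_replicas, center + pull

replicas = [0.0, 10.0, 5.0]   # model replicas with inconsistent weights
center = 0.0                  # shared "center" model
for _ in range(300):
    replicas, center = elastic_averaging_round(replicas, center)
print(center)  # both the center and the replicas settle near the optimum x = 3
```

The elastic link keeps the replicas from drifting apart (addressing weight inconsistency) without forcing a hard synchronization barrier at every step.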
DOI: not available
Pages: 5477 - 5489
Related Papers (50 total):
  • [1] Pipeline Parallelism With Elastic Averaging
    Jang, Bongwon
    Yoo, In-Chul
    Yook, Dongsuk
    IEEE ACCESS, 2024, 12 : 5477 - 5489
  • [2] BUILDING PARALLELISM INTO THE INSTRUCTION PIPELINE
    CHAN, S
    HORST, R
    HIGH PERFORMANCE SYSTEMS-THE MAGAZINE FOR TECHNOLOGY CHAMPIONS, 1989, 10 (12): : 53+
  • [3] On-the-fly pipeline parallelism
    Lee, I.-T.A.
    Leiserson, C.E.
    Schardl, T.B.
    Zhang, Z.
    Sukha, J.
    ACM Transactions on Parallel Computing, 2015, 2 (03)
  • [4] Analytical Modeling of Pipeline Parallelism
    Navarro, Angeles
    Asenjo, Rafael
    Tabik, Siham
    Cascaval, Calin
    18TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, 2009, : 281+
  • [5] Load-Balanced Pipeline Parallelism
    Kamruzzaman, Md
    Swanson, Steven
    Tullsen, Dean M.
    2013 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2013,
  • [6] A flexible communication mechanism for pipeline parallelism
    Wang, Junchang
    Tian, Yangfeng
    Li, Tao
    Fu, Xiong
    2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017), 2017, : 778 - 785
  • [7] Feedback-Directed Pipeline Parallelism
    Suleman, M. Aater
    Qureshi, Moinuddin K.
    Khubaib
    Patt, Yale N.
    PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2010, : 147 - 156
  • [8] PiPar: Pipeline parallelism for collaborative machine learning
    Zhang, Zihan
    Kilpatrick, Peter
    Spence, Ivor
    Varghese, Blesson
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 193
  • [9] USING PARALLELISM AND PIPELINE FOR THE OPTIMIZATION OF JOIN QUERIES
    SPILIOPOULOU, M
    HATZOPOULOS, M
    VASSILAKIS, C
    LECTURE NOTES IN COMPUTER SCIENCE, 1992, 605 : 279 - 294
  • [10] PipeDream: Generalized Pipeline Parallelism for DNN Training
    Narayanan, Deepak
    Harlap, Aaron
    Phanishayee, Amar
    Seshadri, Vivek
    Devanur, Nikhil R.
    Ganger, Gregory R.
    Gibbons, Phillip B.
    Zaharia, Matei
    PROCEEDINGS OF THE TWENTY-SEVENTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP '19), 2019, : 1 - 15