Optimal distributed parallel algorithms for deep learning framework Tensorflow

Cited by: 9
Authors
Xie, Yuanlun [1 ]
He, Majun [1 ]
Ma, Tingsong [1 ]
Tian, Wenhong [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu, Peoples R China
Keywords
Deep learning; Tensorflow; Data parallelism; Model parallelism; Optimal distributed parallel algorithms;
DOI
10.1007/s10489-021-02588-9
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Since its release, the Tensorflow framework has been widely used in many fields owing to its strengths in deep learning. However, it is still at an early stage: its native distributed implementation scales poorly to large models because of low utilization of multiple GPUs and slow distributed execution compared with running on a single machine. Reducing training time through parallel models is therefore of great significance. In view of this, we first provide an in-depth analysis of the implementation principles of Tensorflow and identify the bottlenecks of its native distributed parallel models. Then, two optimal algorithms are designed and implemented, based on the data parallelism and model parallelism modes of Tensorflow. For data parallelism, the proposed algorithm replaces the native linear execution mode with a pipelined execution mode. For model parallelism, the native random partitioning mode is replaced by our proposed greedy algorithm. Finally, we built a homogeneous distributed cluster and a heterogeneous distributed cluster to verify the effectiveness of the proposed algorithms. Through a number of comparative experiments, we show that the proposed optimal parallel algorithms reduce model training time by an average of 26.5% (an average speedup of 1.5x over the native distributed algorithms) and improve cluster utilization while maintaining the same accuracy level as native Tensorflow.
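The abstract states that the data-parallel optimization replaces Tensorflow's native linear execution mode with a pipelined one. The exact mechanism is not described in this record; the following is a minimal, framework-agnostic Python sketch of the idea, in which preparation of batch k+1 overlaps with the training step on batch k instead of the two stages running back to back. The stage costs and the names prepare_batch and train_step are illustrative assumptions, not the paper's code.

import queue
import threading
import time

NUM_BATCHES = 8

def prepare_batch(i):
    # Stand-in for input loading + preprocessing (assumed fixed cost).
    time.sleep(0.05)
    return f"batch-{i}"

def train_step(batch):
    # Stand-in for the forward/backward pass on a device (assumed fixed cost).
    time.sleep(0.05)

def linear_mode():
    # Native linear execution: prepare, then train, strictly in sequence.
    for i in range(NUM_BATCHES):
        train_step(prepare_batch(i))

def pipeline_mode():
    # Pipelined execution: a producer thread prepares the next batch
    # while the main thread trains on the current one.
    q = queue.Queue(maxsize=2)  # small buffer that decouples the two stages

    def producer():
        for i in range(NUM_BATCHES):
            q.put(prepare_batch(i))
        q.put(None)  # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    while (batch := q.get()) is not None:
        train_step(batch)

for mode in (linear_mode, pipeline_mode):
    start = time.perf_counter()
    mode()
    print(f"{mode.__name__}: {time.perf_counter() - start:.2f}s")

With equal stage costs, the pipelined version takes roughly half the wall-clock time of the linear one, which is the effect the paper exploits to raise device utilization.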
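For model parallelism, the abstract says the native random partitioning of the model across devices is replaced by a proposed greedy algorithm. The record does not spell out the greedy criterion; the sketch below shows one standard greedy load-balancing heuristic (heaviest operation first, always onto the least-loaded device) that a partitioner of this kind could use. The per-op cost table and device names are hypothetical.

import heapq

def greedy_partition(op_costs, devices):
    # Min-heap of (accumulated load, device) so the least-loaded
    # device is always at the top.
    loads = [(0.0, d) for d in devices]
    heapq.heapify(loads)
    placement = {}
    # Assign the most expensive remaining op to the least-loaded device.
    for op, cost in sorted(op_costs.items(), key=lambda kv: -kv[1]):
        load, device = heapq.heappop(loads)
        placement[op] = device
        heapq.heappush(loads, (load + cost, device))
    return placement

# Hypothetical per-op cost estimates (e.g., profiled runtimes in ms).
op_costs = {"conv1": 8.0, "conv2": 6.0, "fc1": 4.0, "fc2": 3.0, "softmax": 1.0}
print(greedy_partition(op_costs, ["/gpu:0", "/gpu:1"]))

A real graph partitioner would also weigh cross-device communication along graph edges, which this load-only sketch deliberately ignores.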
Pages: 3880-3900
Page count: 21
Related Papers (50 in total)
  • [1] Optimal distributed parallel algorithms for deep learning framework Tensorflow
    Yuanlun Xie
    Majun He
    Tingsong Ma
    Wenhong Tian
    [J]. Applied Intelligence, 2022, 52 : 3880 - 3900
  • [2] Detailed Performance Analysis of Distributed Tensorflow on a GPU Cluster using Deep Learning Algorithms
    Malik, Abid
    Lu, Micheal
    Wang, Nathenial
    Lin, Yeiwei
    Yoo, Shinjae
    [J]. 2018 NEW YORK SCIENTIFIC DATA SUMMIT (NYSDS), 2018,
  • [3] Distributed Deep Reinforcement Learning using TensorFlow
    Rao, P. Ajay
    Kumar, Navaneesh B.
    Cadabam, Siddharth
    Praveena, T.
    [J]. 2017 INTERNATIONAL CONFERENCE ON CURRENT TRENDS IN COMPUTER, ELECTRICAL, ELECTRONICS AND COMMUNICATION (CTCEEC), 2017, : 171 - 174
  • [4] Accelerating geostatistical seismic inversion using TensorFlow: A heterogeneous distributed deep learning framework
    Liu, Mingliang
    Grana, Dario
    [J]. COMPUTERS & GEOSCIENCES, 2019, 124 : 37 - 45
  • [5] OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning
    Jiang, Youhe
    Fu, Fangcheng
    Miao, Xupeng
    Nie, Xiaonan
    Cui, Bin
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 2142 - 2150
  • [6] Boosting algorithms for parallel and distributed learning
    Lazarevic, A
    Obradovic, Z
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2002, 11 (02) : 203 - 229
  • [7] Boosting Algorithms for Parallel and Distributed Learning
    Aleksandar Lazarevic
    Zoran Obradovic
    [J]. Distributed and Parallel Databases, 2002, 11 : 203 - 229
  • [8] Deep Learning With TensorFlow: A Review
    Pang, Bo
    Nijkamp, Erik
    Wu, Ying Nian
    [J]. JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2020, 45 (02) : 227 - 248
  • [9] A Framework for Parallel Genetic Algorithms for Distributed Memory Architectures
    Georgiev, Dobromir
    Atanassov, Emanouil
    Alexandrov, Vassil
    [J]. 2014 5TH WORKSHOP ON LATEST ADVANCES IN SCALABLE ALGORITHMS FOR LARGE-SCALE SYSTEMS (SCALA), 2014, : 47 - 53
  • [10] PMA-DRL: A parallel model-augmented framework for deep reinforcement learning algorithms
    Luo, Xufang
    Wang, Yunhong
    [J]. NEUROCOMPUTING, 2020, 403 : 109 - 120