Training deep neural networks: a static load balancing approach

Cited by: 11
Authors
Moreno-Alvarez, Sergio [1]
Haut, Juan M. [2]
Paoletti, Mercedes E. [2]
Rico-Gallego, Juan A. [1]
Diaz-Martin, Juan C. [2]
Plaza, Javier [2]
Affiliations
[1] Univ Extremadura, Dept Comp Syst Engn & Telemat, Caceres, Spain
[2] Univ Extremadura, Dept Technol Comp & Commun, Caceres, Spain
Source
Journal of Supercomputing | 2020, Vol. 76, Issue 12
Keywords
Deep learning; High-performance computing; Distributed training; Heterogeneous platforms
DOI
10.1007/s11227-020-03200-6
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks are currently trained under data-parallel setups on high-performance computing (HPC) platforms, so that a replica of the full model is assigned to each computational resource and trained on non-overlapping subsets of the data known as batches. Replicas combine the computed gradients to update their local copies at the end of each batch. However, differences in the performance of the resources assigned to replicas in current heterogeneous platforms induce waiting times when gradients are combined synchronously, leading to an overall performance degradation. Although asynchronous communication of gradients has been proposed as an alternative, it suffers from the so-called staleness problem: the training in each replica is computed using a stale version of the parameters, which negatively impacts the accuracy of the resulting model. In this work, we study the application of well-known HPC static load balancing techniques to the distributed training of deep models. Our approach assigns a different batch size to each replica, proportional to its relative computing capacity, hence minimizing the staleness problem. Our experimental results (obtained in the context of a remotely sensed hyperspectral image processing application) show that, while the classification accuracy is kept constant, the training time substantially decreases with respect to unbalanced training. This is illustrated using heterogeneous computing platforms made up of CPUs and GPUs with different performance.
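The batch-size assignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `balanced_batch_sizes`, the idea of expressing each replica's capacity as a measured speed (for example, samples per second), and the largest-remainder rounding step are all assumptions made here for illustration.

```python
def balanced_batch_sizes(speeds, global_batch):
    """Split a global batch across replicas in proportion to their speeds.

    speeds       -- hypothetical per-replica capacity measurements
                    (e.g. samples/second); must all be positive
    global_batch -- total number of samples processed per training step
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")
    if global_batch < 0:
        raise ValueError("global_batch must be non-negative")

    total = sum(speeds)
    # Ideal (fractional) share of the global batch for each replica.
    ideal = [global_batch * s / total for s in speeds]
    # Round down first, then hand the leftover samples to the replicas
    # with the largest fractional parts so the sizes sum to global_batch.
    sizes = [int(x) for x in ideal]
    leftover = global_batch - sum(sizes)
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: ideal[i] - sizes[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        sizes[i] += 1
    return sizes


# Two equally fast replicas and one twice as fast share a batch of 128.
print(balanced_batch_sizes([1.0, 1.0, 2.0], 128))  # [32, 32, 64]
```

With proportional sizes, each replica should need roughly the same time per step, which is what reduces waiting at the synchronous gradient combination.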
Pages: 9739-9754
Page count: 16