Heterogeneous gradient computing optimization for scalable deep neural networks

Cited by: 0
Authors
Sergio Moreno-Álvarez
Mercedes E. Paoletti
Juan A. Rico-Gallego
Juan M. Haut
Affiliations
[1] University of Extremadura, Department of Computer Systems Engineering and Telematics
[2] Complutense University of Madrid, Department of Computer Architecture
[3] University of Extremadura, Department of Technology of Computers and Communications
Source
The Journal of Supercomputing, 2022, 78(11): 13455–13469
Keywords
Deep learning; Deep neural networks; High-performance computing; Heterogeneous platforms; Distributed training
DOI
Not available
Abstract
Nowadays, data processing applications based on neural networks must cope with ever-growing volumes of data and with increasingly deep and complex network architectures, and hence with a growing number of parameters to learn. High-performance computing platforms offer fast computing resources, including multi-core processors and graphics processing units, to handle the computational burden of deep neural network applications. A common optimization technique, known as data parallelism, distributes the workload among the processes deployed on the platform's resources. Each process, known as a replica, trains its own copy of the model on a disjoint data partition. However, when the platform's computational resources are heterogeneous, the workload must be distributed unevenly among the replicas, according to their computational capabilities, to optimize overall execution performance. Since each replica then processes a different amount of data, the gradients computed by the replicas should contribute differently to the global parameter update. This work proposes a modification of the gradient computation method that takes into account the replicas' different speeds and, hence, the amount of data assigned to each. Experiments conducted on heterogeneous high-performance computing platforms, covering a wide range of models and datasets, show an improvement in final accuracy over current techniques, with comparable performance.
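To make the weighting idea concrete, below is a minimal sketch in Python; it is not the authors' exact formulation. The function name, the numpy-based setup, and the proportional weighting rule w_i = n_i / N are illustrative assumptions about how per-replica gradients could be combined in proportion to each replica's data share instead of averaged uniformly:

    import numpy as np

    def aggregate_gradients(replica_grads, shard_sizes):
        # Weight each replica's gradient by its share of the training
        # data, w_i = n_i / N, instead of the uniform 1/P average used
        # when all replicas process equally sized partitions.
        weights = np.asarray(shard_sizes, dtype=float)
        weights /= weights.sum()
        return sum(w * g for w, g in zip(weights, replica_grads))

    # Example: three replicas on a heterogeneous platform, where faster
    # devices were assigned larger data shards.
    grads = [np.array([1.0, 2.0]), np.array([0.5, 0.5]), np.array([2.0, 1.0])]
    shards = [600, 300, 100]
    print(aggregate_gradients(grads, shards))  # skewed toward replica 0

In a real data-parallel run, such a weighted sum would replace the uniform average applied at each gradient synchronization step, so that replicas holding larger partitions have proportionally more influence on the global parameter update.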
Pages: 13455–13469
Number of pages: 14