Distributed Deep Learning on Wimpy Smartphone Nodes

Cited: 0
Authors
Hemed, Tzoof [1]
Lavie, Nitai [1]
Kaplan, Roman [1]
Affiliations
[1] Technion - Israel Institute of Technology, Haifa, Israel
Keywords
Deep Learning; Distributed Computing; Mobile Computing; Smartphones
DOI: Not available
Chinese Library Classification: TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Codes: 0808; 0809
Abstract
Deep Neural Networks (DNNs), which contain multiple convolutional and several fully connected layers, require considerable hardware resources to train in a reasonable time. Multiple CPUs, GPUs, or FPGAs are usually combined to reduce the training time of a DNN. However, many individuals and small organizations lack the resources to obtain multiple hardware units. The contribution of this work is two-fold. First, we present an implementation of a distributed DNN training system that uses multiple small (wimpy) nodes to accelerate the training process. The nodes are mobile smartphone devices with varying hardware specifications. All DNN training tasks are performed on the small nodes, coordinated by a centralized server. Second, we propose a novel method to mitigate issues arising from the variability in hardware resources. We demonstrate that the method allows training a DNN to high accuracy on known image recognition datasets with multiple small, heterogeneous nodes. The proposed method weights the contribution of each node according to its run time on a specific training task, relative to the other nodes. In addition, we discuss practical challenges that arise in a small-node system and suggest several solutions.
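A runtime-weighted aggregation of the kind the abstract describes could look like the following minimal Python/NumPy sketch, in which the coordinating server averages per-node gradients with weights derived from each node's measured run time. The inverse-runtime weighting, the function name aggregate_gradients, and the synchronous aggregation step are illustrative assumptions; the paper's exact weighting scheme is not given here.

    import numpy as np

    def aggregate_gradients(grads, runtimes):
        # grads:    one gradient array per node, all with the same shape
        # runtimes: wall-clock seconds each node took for its training task
        # Assumption: a node's weight is proportional to its speed
        # (inverse run time), normalized so the weights sum to 1.
        speeds = 1.0 / np.asarray(runtimes, dtype=np.float64)
        weights = speeds / speeds.sum()
        return sum(w * g for w, g in zip(weights, grads))

    # Example: three heterogeneous phones report gradients for one batch.
    grads = [np.array([0.20, -0.10]), np.array([0.25, -0.05]), np.array([0.10, -0.20])]
    runtimes = [1.2, 0.8, 3.5]  # the slowest phone took 3.5 s
    print(aggregate_gradients(grads, runtimes))

Under this assumed scheme, faster nodes dominate each aggregated update, while a straggling node still contributes rather than stalling the round.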
Pages: 5