DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems

Cited by: 0
Authors
Rapp, Martin [1 ]
Khalili, Ramin [2 ]
Pfeiffer, Kilian [1 ]
Henkel, Joerg [1 ]
Affiliations
[1] Karlsruhe Inst Technol, Karlsruhe, Germany
[2] Huawei Res Ctr, Munich, Germany
Keywords
Neural networks
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We study the problem of distributed training of neural networks (NNs) on devices with heterogeneous, limited, and time-varying availability of computational resources. We present an adaptive, resource-aware, on-device learning mechanism, DISTREAL, which is able to fully and efficiently utilize the available resources on devices in a distributed manner, increasing the convergence speed. This is achieved with a dropout mechanism that dynamically adjusts the computational complexity of training an NN by randomly dropping filters of convolutional layers of the model. Our main contribution is the introduction of a design space exploration (DSE) technique, which finds Pareto-optimal per-layer dropout vectors with respect to resource requirements and convergence speed of the training. Applying this technique, each device is able to dynamically select the dropout vector that fits its available resources without requiring any assistance from the server. We implement our solution in a federated learning (FL) system, where the availability of computational resources varies both between devices and over time, and show through extensive evaluation that we are able to significantly increase the convergence speed over the state of the art without compromising on the final accuracy.
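The abstract describes two per-round steps that the sketch below illustrates: a device first picks, from the precomputed Pareto front, the per-layer dropout vector with the highest resource cost that still fits its current budget, and then trains with a random subset of convolutional filters dropped according to that vector. This is a minimal PyTorch sketch under assumptions of ours, not the authors' implementation: the Pareto front values, the three-layer model, and the hook-based masking (which only emulates dropped filters by zeroing their activations instead of skipping their computation) are all illustrative.

# Minimal sketch (assumed, not the authors' code): select a per-layer dropout
# vector from a precomputed Pareto front and emulate filter dropout during one
# local training round. All values below are made up for illustration.
import torch
import torch.nn as nn

# Hypothetical output of the offline DSE: (relative training cost, per-layer
# probability of KEEPING each conv layer's filters). Lower cost -> smaller model.
PARETO_FRONT = [
    (0.25, [0.4, 0.5, 0.5]),
    (0.50, [0.6, 0.7, 0.7]),
    (0.75, [0.8, 0.9, 0.9]),
    (1.00, [1.0, 1.0, 1.0]),  # full model, no filters dropped
]

def select_vector(resource_budget):
    """Pick the highest-cost (fastest-converging) vector that fits the budget."""
    feasible = [entry for entry in PARETO_FRONT if entry[0] <= resource_budget]
    return max(feasible, key=lambda e: e[0])[1] if feasible else PARETO_FRONT[0][1]

def mask_filters(model, keep_probs):
    """Attach forward hooks that zero a random subset of output filters of each
    Conv2d layer, emulating per-layer filter dropout for one training round."""
    handles = []
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    for conv, p in zip(convs, keep_probs):
        mask = (torch.rand(conv.out_channels) < p).float()

        def hook(module, inputs, output, mask=mask):
            return output * mask.view(1, -1, 1, 1).to(output.device)

        handles.append(conv.register_forward_hook(hook))
    return handles  # call h.remove() on each handle after the round

# Example local round on a device whose budget is 60% of the full training cost:
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(64, 10),
)
keep_probs = select_vector(resource_budget=0.6)   # -> the 0.50-cost vector
handles = mask_filters(model, keep_probs)
loss = nn.functional.cross_entropy(model(torch.randn(8, 3, 32, 32)),
                                   torch.randint(0, 10, (8,)))
loss.backward()
for h in handles:
    h.remove()

In DISTREAL itself the dropped filters are not computed at all, which is where the resource savings come from; zeroing activations as above only reproduces the statistical effect of the dropout vector on training, so the sketch should be read as a conceptual illustration.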
Pages: 8062-8071
Page count: 10
Related Papers
50 in total
  • [1] Vasile, Mihaela-Andreea; Pop, Florin; Tutueanu, Radu-Ioan; Cristea, Valentin; Kolodziej, Joanna. Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing. Future Generation Computer Systems, 2015, 51: 61-71.
  • [2] Jedari, Behrouz; Dehghan, Mahdi. Efficient DAG Scheduling with Resource-Aware Clustering for Heterogeneous Systems. Computer and Information Science 2009, 2009, 208: 249-261.
  • [3] Teresco, J. D.; Faik, J.; Flaherty, J. E. Resource-aware scientific computation on a heterogeneous cluster. Computing in Science & Engineering, 2005, 7(2): 40-50.
  • [4] Nguyen, Hung T.; Morabito, Roberto; Kim, Kwang Taik; Chiang, Mung. On-the-fly Resource-Aware Model Aggregation for Federated Learning in Heterogeneous Edge. 2021 IEEE Global Communications Conference (GLOBECOM), 2021.
  • [5] Yoosefi, Amin; Kargahi, Mehdi. Resource-aware in-edge distributed real-time deep learning. Internet of Things, 2024, 27.
  • [6] Mei, Jing; Li, Kenli; Li, Keqin. A resource-aware scheduling algorithm with reduced task duplication on heterogeneous computing systems. Journal of Supercomputing, 2014, 68(3): 1347-1377.
  • [7] Han, Jian-Jun; Cai, Wen; Zhu, Dakai. Resource-Aware Partitioned Scheduling for Heterogeneous Multicore Real-Time Systems. 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), 2018.
  • [8] Mei, Jing; Li, Kenli; Li, Keqin. A resource-aware scheduling algorithm with reduced task duplication on heterogeneous computing systems. The Journal of Supercomputing, 2014, 68: 1347-1377.
  • [9] Kim, Donghyeon; Kang, Seokwon; Lim, Junsu; Jung, Sunwook; Kim, Woosung; Park, Yongjun. Resource-Aware Device Allocation of Data-Parallel Applications on Heterogeneous Systems. Electronics, 2020, 9(11): 1-18.
  • [10] Trappe, W.; Wang, Y.; Liu, K. J. R. Resource-aware conference key establishment for heterogeneous networks. IEEE/ACM Transactions on Networking, 2005, 13(1): 134-146.