Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput

Cited by: 2

Authors:

Parthasarathy, Arjun [1]

Krishnamachari, Bhaskar [2]

Affiliations:

[1] Crystal Springs Uplands School, Hillsborough, CA 94010, USA

[2] University of Southern California, Los Angeles, CA 90007, USA

Source:

2022 32ND INTERNATIONAL TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE (ITNAC) | 2022

DOI:

10.1109/ITNAC55475.2022.9998427

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. We present an algorithm that partitions DNNs and distributes them across a set of edge devices with the goal of minimizing the bottleneck latency and therefore maximizing inference throughput. The system scales well across different node memory capacities and numbers of nodes. We find that we can reduce the bottleneck latency by 10x over a random algorithm and by 35% over a greedy joint partitioning-placement algorithm. Furthermore, we find empirically that, for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency.

Pages: 239-246

Page count: 8
