Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput

Cited by: 2
Authors
Parthasarathy, Arjun [1]
Krishnamachari, Bhaskar [2]
Affiliations
[1] Crystal Springs Uplands Sch, Hillsborough, CA 94010 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
DOI
10.1109/ITNAC55475.2022.9998427
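Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract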
Edge inference has become more widespread, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. We present an algorithm that partitions DNNs and distributes them across a set of edge devices with the goal of minimizing the bottleneck latency and therefore maximizing inference throughput. The system scales well to clusters with different node memory capacities and numbers of nodes. We find that we can reduce the bottleneck latency by 10x compared with a random algorithm and by 35% compared with a greedy joint partitioning-placement algorithm. Furthermore, we find empirically that, for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency.
Pages: 239-246
Page count: 8