Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput

Cited by: 2
Authors
Parthasarathy, Arjun [1 ]
Krishnamachari, Bhaskar [2 ]
Affiliations
[1] Crystal Springs Uplands Sch, Hillsborough, CA 94010 USA
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
Keywords
DOI
10.1109/ITNAC55475.2022.9998427
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
Edge inference has become more widespread, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. We present an algorithm that partitions DNNs and distributes the partitions across a set of edge devices with the goal of minimizing the bottleneck latency and thereby maximizing inference throughput. The system scales well across different node memory capacities and numbers of nodes. We find that we can reduce the bottleneck latency by 10x compared to a random algorithm and by 35% compared to a greedy joint partitioning-placement algorithm. Furthermore, we find empirically that, for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency.
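The abstract's objective can be illustrated with a small sketch (not the paper's algorithm): when a DNN is split into a linear pipeline of partitions placed on separate devices, steady-state throughput is governed by the slowest stage, so minimizing the bottleneck latency maximizes inference throughput. The stage and transfer timings below are hypothetical values chosen for illustration.

```python
def bottleneck_latency(stage_compute_ms, transfer_ms):
    """Bottleneck latency (ms) of a linear pipeline.

    stage_compute_ms[i]: compute time of partition i on its device.
    transfer_ms[i]: time to send partition i's activations onward
                    (the last entry is the final output transfer).
    """
    assert len(stage_compute_ms) == len(transfer_ms)
    # The slowest stage (compute + outgoing transfer) limits the pipeline.
    return max(c + t for c, t in zip(stage_compute_ms, transfer_ms))

def throughput_per_sec(stage_compute_ms, transfer_ms):
    """Steady-state inferences per second of the pipeline."""
    return 1000.0 / bottleneck_latency(stage_compute_ms, transfer_ms)

# Hypothetical example: three devices; the middle stage dominates.
compute = [4.0, 9.0, 3.0]    # ms of compute per partition
transfer = [1.0, 1.0, 0.5]   # ms to ship activations between devices
print(bottleneck_latency(compute, transfer))   # 10.0
print(throughput_per_sec(compute, transfer))   # 100.0
```

Under this model, any placement that shortens the slowest stage (e.g. moving layers off the middle device) directly raises throughput, which is why the paper targets bottleneck latency rather than end-to-end latency.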
Pages: 239-246
Page count: 8
Related Papers
(50 records in total)
  • [1] Partitioning Convolutional Neural Networks to Maximize the Inference Rate on Constrained IoT Devices
    Campos de Oliveira, Fabiola Martins
    Borin, Edson
    [J]. FUTURE INTERNET, 2019, 11 (10)
  • [2] Neural Networks Meet Physical Networks: Distributed Inference Between Edge Devices and the Cloud
    Chinchali, Sandeep P.
    Cidon, Eyal
    Pergament, Evgenya
    Chu, Tianshu
    Katti, Sachin
    [J]. HOTNETS-XVII: PROCEEDINGS OF THE 2018 ACM WORKSHOP ON HOT TOPICS IN NETWORKS, 2018, : 50 - 56
  • [3] Hybrid Partitioning for Embedded and Distributed CNNs Inference on Edge Devices
    Kaboubi, Nihel
    Letondeur, Loic
    Coupaye, Thierry
    Desprez, Frederic
    Trystram, Denis
    [J]. ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2022, PT I, 2023, 1797 : 164 - 187
  • [4] Distributed Deep Neural Networks over the Cloud, the Edge and End Devices
    Teerapittayanon, Surat
    McDanel, Bradley
    Kung, H. T.
    [J]. 2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2017), 2017, : 328 - 339
  • [5] Scaling for edge inference of deep neural networks
    Xu, Xiaowei
    Ding, Yukun
    Hu, Sharon Xiaobo
    Niemier, Michael
    Cong, Jason
    Hu, Yu
    Shi, Yiyu
    [J]. NATURE ELECTRONICS, 2018, 1 (04): : 216 - 222
  • [7] DNN Partitioning for Inference Throughput Acceleration at the Edge
    Feltin, Thomas
    Marcho, Leo
    Cordero-Fuertes, Juan-Antonio
    Brockners, Frank
    Clausen, Thomas H.
    [J]. IEEE ACCESS, 2023, 11 : 52236 - 52249
  • [8] Collaborative Inference for Deep Neural Networks in Edge Environments
    Liu, Meizhao
    Gu, Yingcheng
    Dong, Sen
    Wei, Liu
    Liu, Kai
    Yan, Yuting
    Song, Yu
    Cheng, Huanyu
    Tang, Lei
    Zhang, Sheng
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2024, 18 (07): : 1749 - 1773
  • [9] Distributed Deep Neural Network Training on Edge Devices
    Benditkis, Daniel
    Keren, Aviv
    Mor-Yosef, Liron
    Avidor, Tomer
    Shoham, Neta
    Tal-Israel, Nadav
    [J]. SEC'19: PROCEEDINGS OF THE 4TH ACM/IEEE SYMPOSIUM ON EDGE COMPUTING, 2019, : 304 - 306
  • [10] Partitioning Sparse Deep Neural Networks for Scalable Training and Inference
    Demirci, Gunduz Vehbi
    Ferhatosmanoglu, Hakan
    [J]. PROCEEDINGS OF THE 2021 ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2021, 2021, : 254 - 265