Partitioning Convolutional Neural Networks to Maximize the Inference Rate on Constrained IoT Devices

Cited by: 19
Authors
Campos de Oliveira, Fabiola Martins [1]
Borin, Edson [1]
Affiliations
[1] Univ Estadual Campinas, Inst Comp, BR-13083852 Campinas, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Internet of Things; convolutional neural networks; graph partitioning; distributed systems; resource-efficient inference; INTERNET; THINGS; RECOGNITION; QUALITY;
DOI
10.3390/fi11100209
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Billions of devices will compose the IoT system in the next few years, generating a huge amount of data. We can use fog computing to process these data, since sending them to the cloud may overload the network. In this context, deep learning can process these data, but the memory requirements of deep neural networks may prevent them from executing on a single resource-constrained device. Furthermore, their computational requirements may yield an infeasible execution time. In this work, we propose Deep Neural Networks Partitioning for Constrained IoT Devices, a new algorithm to partition neural networks for efficient distributed execution. Our algorithm can optimize either the neural network inference rate or the number of communications among devices. Additionally, it accounts appropriately for the shared parameters and biases of Convolutional Neural Networks. We investigate inference rate maximization for the LeNet model in constrained setups. We show that popular machine learning frameworks such as TensorFlow, and the general-purpose framework METIS, may produce invalid partitionings for very constrained setups. The results show that our algorithm can partition LeNet for all the proposed setups, yielding up to 38% more inferences per second than METIS.
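The two notions the abstract relies on, a valid partitioning and the inference rate, can be illustrated with a small sketch. This is not the paper's algorithm: the `Vertex` and `Device` types, the function names, the toy numbers, and the pipeline throughput model (rate limited by the slowest device's compute or transfer time) are all assumptions made here for illustration. The one idea taken from the abstract is that shared parameters and biases must be accounted for on every device that holds a vertex using them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Vertex:
    name: str
    flops: float                        # work per inference
    mem_bytes: int                      # private memory (e.g. activations)
    shared_group: Optional[str] = None  # id of shared weights/biases, if any


@dataclass(frozen=True)
class Device:
    mem_bytes: int      # memory capacity
    flops_per_s: float  # compute speed
    bytes_per_s: float  # outgoing link bandwidth


def device_memory(dev, assignment, vertices, group_bytes):
    """Memory used on `dev`; each shared group is counted once per device."""
    mine = [v for v in vertices if assignment[v.name] == dev]
    groups = {v.shared_group for v in mine if v.shared_group is not None}
    return sum(v.mem_bytes for v in mine) + sum(group_bytes[g] for g in groups)


def is_valid(assignment, vertices, devices, group_bytes):
    """A partitioning is valid here if no device exceeds its memory."""
    return all(
        device_memory(d, assignment, vertices, group_bytes) <= devices[d].mem_bytes
        for d in devices
    )


def inference_rate(assignment, vertices, edges, devices):
    """Assumed model: inferences/s = 1 / slowest compute or transfer time."""
    compute = {d: 0.0 for d in devices}
    sent = {d: 0.0 for d in devices}
    for v in vertices:
        compute[assignment[v.name]] += v.flops
    for src, dst, nbytes in edges:
        if assignment[src] != assignment[dst]:  # only cross-device traffic
            sent[assignment[src]] += nbytes
    slowest = max(
        max(compute[d] / devices[d].flops_per_s, sent[d] / devices[d].bytes_per_s)
        for d in devices
    )
    return 1.0 / slowest if slowest > 0 else float("inf")


# Hypothetical toy network: two convolution slices sharing kernel "k1".
vertices: List[Vertex] = [
    Vertex("conv_a", 2e6, 4000, "k1"),
    Vertex("conv_b", 2e6, 4000, "k1"),
    Vertex("fc", 1e6, 2000),
]
edges: List[Tuple[str, str, int]] = [("conv_a", "fc", 500), ("conv_b", "fc", 500)]
group_bytes: Dict[str, int] = {"k1": 1000}
devices: Dict[str, Device] = {
    "d0": Device(10000, 1e7, 1e5),
    "d1": Device(6000, 1e7, 1e5),
}

together = {"conv_a": "d0", "conv_b": "d0", "fc": "d1"}
split = {"conv_a": "d0", "conv_b": "d1", "fc": "d1"}

for label, a in (("together", together), ("split", split)):
    print(label, is_valid(a, vertices, devices, group_bytes),
          round(inference_rate(a, vertices, edges, devices), 3))
```

In this toy setup, splitting the two convolution slices balances the compute load better, but the shared kernel must then be stored on both devices, and the second device runs out of memory. A partitioner that ignores memory limits or shared parameters could therefore return a faster-looking but invalid assignment.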
Pages: 30
Related papers
50 in total
  • [1] Partitioning convolutional neural networks for inference on constrained Internet-of-Things devices
    Campos de Oliveira, Fabiola Martins
    Borin, Edson
    [J]. 2018 30TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2018), 2018, : 266 - 273
  • [2] Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput
    Parthasarathy, Arjun
    Krishnamachari, Bhaskar
    [J]. 2022 32ND INTERNATIONAL TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE (ITNAC), 2022, : 239 - 246
  • [3] Optimization of Convolutional Neural Networks on Resource Constrained Devices
    Arish, S.
    Sinha, Sharad
    Smitha, K. G.
    [J]. 2019 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2019), 2019, : 19 - 24
  • [4] Manto: A Practical and Secure Inference Service of Convolutional Neural Networks for IoT
    Cheng, Ke
    Fu, Jiaxuan
    Shen, Yulong
    Gao, Haichang
    Xi, Ning
    Zhang, Zhiwei
    Zhu, Xinghui
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (16) : 14856 - 14872
  • [5] Cloud-assisted collaborative inference of convolutional neural networks for vision tasks on resource-constrained devices
    Rodriguez-Conde, Ivan
    Campos, Celso
    Fdez-Riverola, Florentino
    [J]. NEUROCOMPUTING, 2023, 560
  • [6] Multi-Fidelity Matryoshka Neural Networks for Constrained IoT Devices
    Leroux, Sam
    Bohez, Steven
    De Coninck, Elias
    Verbelen, Tim
    Vankeirsbilck, Bert
    Simoens, Pieter
    Dhoedt, Bart
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1305 - 1309
  • [7] Iterative neural networks for adaptive inference on resource-constrained devices
    Sam Leroux
    Tim Verbelen
    Pieter Simoens
    Bart Dhoedt
    [J]. Neural Computing and Applications, 2022, 34 : 10321 - 10336
  • [8] Iterative neural networks for adaptive inference on resource-constrained devices
    Leroux, Sam
    Verbelen, Tim
    Simoens, Pieter
    Dhoedt, Bart
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (13): : 10321 - 10336
  • [9] PANCODE: Multilevel Partitioning of Neural Networks for Constrained Internet-of-Things Devices
    de Oliveira, Fabiola Martins Campos
    Bittencourt, Luiz Fernando
    Kamienski, Carlos Alberto
    Borin, Edson
    [J]. IEEE ACCESS, 2023, 11 : 2058 - 2077
  • [10] Convolutional Neural Networks for audio classification on ultra low power IoT devices
    Andreadis, Alessandro
    Giambene, Giovanni
    Zambon, Riccardo
    [J]. 2021 IEEE INTERNATIONAL BLACK SEA CONFERENCE ON COMMUNICATIONS AND NETWORKING (IEEE BLACKSEACOM), 2021, : 77 - 82