Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 1
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; RESOURCE-ALLOCATION; AWARE; ACCELERATION
DOI
10.1109/TMC.2024.3357874
Chinese Library Classification
TP [automation technology; computer technology]
Subject Classification Code
0812
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee a DNN inference speed that satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor-network and critic-network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
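The branched policy design described in the abstract can be sketched roughly as follows. This is a hypothetical illustration in plain Python, not the authors' code: all class and function names (`MultiBranchPolicy`, `soft_sharing_penalty`) are assumptions. A shared trunk produces one hidden representation, each DNN block gets its own output branch selecting that block's partition choice, and an L2 penalty softly ties the actor's shared weights to the critic's.

```python
import random

def linear(weights, bias, x):
    """Dense layer: y[j] = sum_i weights[j][i] * x[i] + bias[j]."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

class MultiBranchPolicy:
    """Sketch of a multi-branch actor head: one branch per DNN block."""

    def __init__(self, n_blocks, state_dim, hidden_dim, n_choices, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        # Shared trunk (the actor's copy; the critic would hold its own,
        # coupled to this one by the soft-sharing penalty below).
        self.w_shared = mat(hidden_dim, state_dim)
        self.b_shared = [0.0] * hidden_dim
        # One output branch per DNN block: hidden -> partition-choice logits.
        self.branches = [(mat(n_choices, hidden_dim), [0.0] * n_choices)
                         for _ in range(n_blocks)]

    def act(self, state):
        """Return one partition decision per DNN block."""
        h = [max(0.0, v) for v in linear(self.w_shared, self.b_shared, state)]
        # Greedy argmax per branch; training would sample from a softmax.
        return [max(range(len(b_b)), key=lambda j: linear(w_b, b_b, h)[j])
                for w_b, b_b in self.branches]

def soft_sharing_penalty(w_actor, w_critic):
    """Soft parameter sharing: squared L2 distance between shared weights."""
    return sum((a - c) ** 2
               for row_a, row_c in zip(w_actor, w_critic)
               for a, c in zip(row_a, row_c))
```

The key point of the branch expansion is that the action space stays additive in the number of blocks (one small head per block) rather than growing combinatorially, which is what makes fine-grained partitioning of deep models with many blocks tractable.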
Pages: 9060-9074 (15 pages)
Related Papers (50 total)
  • [1] Fine-grained Resource Management for Edge Computing Satellite Networks
    Wang, Feng
    Jiang, Dingde
    Qi, Sheng
    Qiao, Chen
    Song, Houbing
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [2] Fine-Grained Elastic Partitioning for Distributed DNN Towards Mobile Web AR Services in the 5G Era
    Ren, Pei
    Qiao, Xiuquan
    Huang, Yakun
    Liu, Ling
    Pu, Calton
    Dustdar, Schahram
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (06) : 3260 - 3274
  • [3] Energy conserving cost selection for fine-grained computational offloading in mobile edge computing networks
    Numani, Abdullah
    Abbas, Ziaul Haq
    Abbas, Ghulam
    Ali, Zaiwar
    COMPUTER COMMUNICATIONS, 2024, 213 : 199 - 207
  • [4] Collaborative Inference Acceleration Integrating DNN Partitioning and Task Offloading in Mobile Edge Computing
    Xu, Wenxiu
    Yin, Yin
    Chen, Ningjiang
    Tu, Huan
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2023, 33 (11-12) : 1835 - 1863
  • [5] Throughput Maximization of Delay-Aware DNN Inference in Edge Computing by Exploring DNN Model Partitioning and Inference Parallelism
    Li, Jing
    Liang, Weifa
    Li, Yuchen
    Xu, Zichuan
    Jia, Xiaohua
    Guo, Song
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (05) : 3017 - 3030
  • [6] DNN Inference Acceleration with Partitioning and Early Exiting in Edge Computing
    Li, Chao
    Xu, Hongli
    Xu, Yang
    Wang, Zhiyuan
    Huang, Liusheng
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT I, 2021, 12937 : 465 - 478
  • [7] Fine-grained Cloud Edge Collaborative Dynamic Task Scheduling Based on DNN Layer-Partitioning
    Wang, Xilong
    Li, Xin
    Wang, Ning
    Qin, Xiaolin
    2022 18TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN, 2022, : 155 - 162
  • [8] Task Partitioning and Offloading in DNN-Task Enabled Mobile Edge Computing Networks
    Gao, Mingjin
    Shen, Rujing
    Shi, Long
    Qi, Wen
    Li, Jun
    Li, Yonghui
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (04) : 2435 - 2445
  • [9] Lookup Tables: Fine-Grained Partitioning for Distributed Databases
    Tatarowicz, Aubrey L.
    Curino, Carlo
    Jones, Evan P. C.
    Madden, Sam
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 102 - 113
  • [10] FedGreen: Federated Learning with Fine-Grained Gradient Compression for Green Mobile Edge Computing
    Li, Peichun
    Huang, Xumin
    Pan, Miao
    Yu, Rong
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,