Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 1
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; RESOURCE-ALLOCATION; AWARE; ACCELERATION
DOI
10.1109/TMC.2024.3357874
Chinese Library Classification
TP [automation technology; computer technology]
Subject Classification Code
0812
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee a DNN inference speed that satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor-network and critic-network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
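The branched policy design described in the abstract can be sketched roughly as follows. This is a hypothetical illustration in plain Python, not the authors' code: all class and function names (`MultiBranchPolicy`, `soft_sharing_penalty`) are assumptions. A shared trunk produces one hidden representation, each DNN block gets its own output branch selecting that block's partition choice, and an L2 penalty softly ties the actor's shared weights to the critic's.

```python
import random

def linear(weights, bias, x):
    """Dense layer: y[j] = sum_i weights[j][i] * x[i] + bias[j]."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

class MultiBranchPolicy:
    """Sketch of a multi-branch actor head: one branch per DNN block."""

    def __init__(self, n_blocks, state_dim, hidden_dim, n_choices, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        # Shared trunk (the actor's copy; the critic would hold its own,
        # coupled to this one by the soft-sharing penalty below).
        self.w_shared = mat(hidden_dim, state_dim)
        self.b_shared = [0.0] * hidden_dim
        # One output branch per DNN block: hidden -> partition-choice logits.
        self.branches = [(mat(n_choices, hidden_dim), [0.0] * n_choices)
                         for _ in range(n_blocks)]

    def act(self, state):
        """Return one partition decision per DNN block."""
        h = [max(0.0, v) for v in linear(self.w_shared, self.b_shared, state)]
        # Greedy argmax per branch; training would sample from a softmax.
        return [max(range(len(b_b)), key=lambda j: linear(w_b, b_b, h)[j])
                for w_b, b_b in self.branches]

def soft_sharing_penalty(w_actor, w_critic):
    """Soft parameter sharing: squared L2 distance between shared weights."""
    return sum((a - c) ** 2
               for row_a, row_c in zip(w_actor, w_critic)
               for a, c in zip(row_a, row_c))
```

The key point of the branch expansion is that the action space stays additive in the number of blocks (one small head per block) rather than growing combinatorially, which is what makes fine-grained partitioning of deep models with many blocks tractable.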
Pages: 9060-9074 (15 pages)
Related Papers (50 total)
  • [1] Fine-grained Resource Management for Edge Computing Satellite Networks
    Wang, Feng
    Jiang, Dingde
    Qi, Sheng
    Qiao, Chen
    Song, Houbing
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [2] Fine-Grained Elastic Partitioning for Distributed DNN Towards Mobile Web AR Services in the 5G Era
    Ren, Pei
    Qiao, Xiuquan
    Huang, Yakun
    Liu, Ling
    Pu, Calton
    Dustdar, Schahram
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (06) : 3260 - 3274
  • [3] Energy conserving cost selection for fine-grained computational offloading in mobile edge computing networks
    Numani, Abdullah
    Abbas, Ziaul Haq
    Abbas, Ghulam
    Ali, Zaiwar
    COMPUTER COMMUNICATIONS, 2024, 213 : 199 - 207
  • [4] Collaborative Inference Acceleration Integrating DNN Partitioning and Task Offloading in Mobile Edge Computing
    Xu, Wenxiu
    Yin, Yin
    Chen, Ningjiang
    Tu, Huan
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2023, 33 (11-12) : 1835 - 1863
  • [5] Throughput Maximization of Delay-Aware DNN Inference in Edge Computing by Exploring DNN Model Partitioning and Inference Parallelism
    Li, Jing
    Liang, Weifa
    Li, Yuchen
    Xu, Zichuan
    Jia, Xiaohua
    Guo, Song
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (05) : 3017 - 3030
  • [6] DNN Inference Acceleration with Partitioning and Early Exiting in Edge Computing
    Li, Chao
    Xu, Hongli
    Xu, Yang
    Wang, Zhiyuan
    Huang, Liusheng
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT I, 2021, 12937 : 465 - 478
  • [7] Fine-grained Cloud Edge Collaborative Dynamic Task Scheduling Based on DNN Layer-Partitioning
    Wang, Xilong
    Li, Xin
    Wang, Ning
    Qin, Xiaolin
    2022 18TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN, 2022, : 155 - 162
  • [8] Task Partitioning and Offloading in DNN-Task Enabled Mobile Edge Computing Networks
    Gao, Mingjin
    Shen, Rujing
    Shi, Long
    Qi, Wen
    Li, Jun
    Li, Yonghui
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (04) : 2435 - 2445
  • [9] Lookup Tables: Fine-Grained Partitioning for Distributed Databases
    Tatarowicz, Aubrey L.
    Curino, Carlo
    Jones, Evan P. C.
    Madden, Sam
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 102 - 113
  • [10] FedGreen: Federated Learning with Fine-Grained Gradient Compression for Green Mobile Edge Computing
    Li, Peichun
    Huang, Xumin
    Pan, Miao
    Yu, Rong
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,