The deployment of deep neural network (DNN) models in software applications is increasing rapidly with the exponential growth of artificial intelligence. Currently, developers deploy such models manually in the cloud while accounting for several user requirements, and the decisions of model selection and user assignment are difficult to make. With the rise of the edge computing paradigm, companies tend to deploy applications as close to the user as possible. In this setting, the problem of DNN model selection and inference serving becomes harder due to the communication latency introduced between nodes. We present an automatic method for DNN placement and inference serving in edge computing: a mathematical formulation of the DNN Model Variant Selection and Placement (MVSP) problem that accounts for the inference latency of different model-variants, the communication latency between nodes, and the utilization cost of edge computing nodes. Furthermore, we propose a general heuristic algorithm to solve the MVSP problem. We analyze the effect of hardware sharing on inference latency, using GPU edge computing nodes shared between different DNN model-variants as an example. We evaluate our model numerically and show the potential of GPU sharing: the average millisecond-scale latency per request decreases by 33% for low loads and by 21% for high loads. We study the tradeoff between latency and cost and show the Pareto-optimal curves. Finally, we compare the optimal solution with the proposed heuristic and show that the average latency per request increases by more than 60%; this gap can be reduced with more efficient placement algorithms.
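To illustrate the kind of latency/cost trade-off described above, a minimal sketch of such an objective follows; all symbols ($R$, $N$, $x$, $y$, $\ell_{\text{inf}}$, $\ell_{\text{com}}$, $c_n$, $\lambda$) are illustrative assumptions and not the paper's actual MVSP notation:
\[
\min_{x,\,y} \;\; \underbrace{\frac{1}{|R|}\sum_{r \in R} \Big( \ell_{\text{inf}}(v_r, n_r) + \ell_{\text{com}}(u_r, n_r) \Big)}_{\text{average latency per request}}
\;+\; \lambda \underbrace{\sum_{n \in N} c_n\, y_n}_{\text{node utilization cost}}
\]
where the decision variables $x$ assign each request $r$ (issued by user $u_r$) to a model-variant $v_r$ placed on edge node $n_r$, $y_n$ indicates whether node $n$ is utilized at cost $c_n$, and $\lambda$ weights cost against latency, tracing out the Pareto front as it varies.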