Multi-Model Running Latency Optimization in an Edge Computing Paradigm

Cited by: 12
Authors
Li, Peisong [1]
Wang, Xinheng [1]
Huang, Kaizhu [2]
Huang, Yi [3]
Li, Shancang [4]
Iqbal, Muddesar [5]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Suzhou 215123, Peoples R China
[2] Duke Kunshan Univ, Data Sci Res Ctr, Div Nat & Appl Sci, Suzhou 215316, Peoples R China
[3] Univ Liverpool, Dept Elect Engn & Elect, Liverpool L69 3BX, Merseyside, England
[4] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 3AT, Wales
[5] Prince Sultan Univ, Coll Engn, Commun & Networks Engn Dept, Renewable Energy Lab, Riyadh 11586, Saudi Arabia
Funding
National Natural Science Foundation of China;
Keywords
edge computing; latency optimization; multi-model; task scheduling; autonomous driving; AI;
DOI
10.3390/s22166097
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Recent advances in lightweight deep learning algorithms and edge computing increasingly enable multiple model inference tasks to run concurrently on resource-constrained edge devices, so that the models achieve one goal collaboratively rather than each pursuing high quality in its standalone task. However, the high overall running latency of multi-model inference negatively affects real-time applications. To combat latency, the algorithms should be optimized to minimize the latency of multi-model deployment without compromising safety-critical situations. This work focuses on a real-time task scheduling strategy for multi-model deployment and investigates model inference using the Open Neural Network Exchange (ONNX) Runtime engine. An application deployment strategy based on container technology is then proposed, and inference tasks are scheduled to different containers according to the scheduling strategies. Experimental results show that the proposed solution significantly reduces the overall running latency in real-time applications.
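The abstract does not spell out the scheduling strategies, so the following is only a minimal sketch of the general idea of distributing inference tasks across containers to cut overall latency. It assumes a greedy longest-task-first heuristic, which is a swapped-in textbook technique and not the paper's method; the function name `schedule_tasks`, the model names, and the per-task latencies are all hypothetical.

```python
import heapq


def schedule_tasks(latencies_ms, num_containers):
    """Greedily assign inference tasks to containers (illustrative only).

    Tasks are taken longest first, and each one goes to the container with
    the smallest accumulated latency so far. Returns the assignment and the
    resulting makespan (latency of the busiest container).
    """
    if num_containers < 1:
        raise ValueError("num_containers must be at least 1")

    # Min-heap of (accumulated latency, container id).
    heap = [(0.0, c) for c in range(num_containers)]
    heapq.heapify(heap)
    assignment = {c: [] for c in range(num_containers)}

    ordered = sorted(latencies_ms.items(), key=lambda kv: kv[1], reverse=True)
    for name, latency in ordered:
        load, container = heapq.heappop(heap)
        assignment[container].append(name)
        heapq.heappush(heap, (load + latency, container))

    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical per-model inference latencies in milliseconds.
tasks = {"detector": 40, "segmenter": 30, "classifier": 20, "tracker": 10}
assignment, makespan = schedule_tasks(tasks, num_containers=2)
sequential = sum(tasks.values())
print(assignment, makespan, sequential)
```

With these made-up numbers, running everything in a single container takes the sum of all latencies, whereas spreading the tasks over two containers bounds the overall latency by the busiest container.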
Pages: 17
Related papers
50 in total
  • [21] Multi-Slot Dynamic Computing Resource Optimization in Edge Computing
    Chen, Pengyu
    Xu, Han
    Fan, Xingwang
    Hu, Jing
    Song, Tiecheng
    2022 14TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING, WCSP, 2022, : 160 - 165
  • [22] Joint Optimization of Latency and Reward for Offloading Dependent Tasks in Mobile Edge Computing
    Gong, Yanqi
    Hao, Fei
    Sun, Yifei
    Guo, Longjiang
    20TH INT CONF ON UBIQUITOUS COMP AND COMMUNICAT (IUCC) / 20TH INT CONF ON COMP AND INFORMATION TECHNOLOGY (CIT) / 4TH INT CONF ON DATA SCIENCE AND COMPUTATIONAL INTELLIGENCE (DSCI) / 11TH INT CONF ON SMART COMPUTING, NETWORKING, AND SERV (SMARTCNS), 2021, : 68 - 75
  • [23] Offloading Optimization for Low-Latency Secure Mobile Edge Computing Systems
    Zhou, Yi
    Yeoh, Phee Lep
    Pan, Cunhua
    Wang, Kezhi
    Elkashlan, Maged
    Wang, Zhongfeng
    Vucetic, Branka
    Li, Yonghui
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2020, 9 (04) : 480 - 484
  • [24] Machine Learning Driven Latency Optimization for Internet of Things Applications in Edge Computing
    Uchechukwu AWADA
    ZHANG Jiankang
    CHEN Sheng
    LI Shuangzhi
    YANG Shouyi
    ZTE Communications, 2023, 21 (02) : 40 - 52
  • [25] Latency Optimization for Mobile Edge Computing Based Proximity Detection in Road Networks
    Song, Yunlong
    Liu, Yaqiong
    Shou, Guochu
    Hu, Yihong
    2020 IEEE/CIC INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN CHINA (ICCC WORKSHOPS), 2020, : 145 - 150
  • [26] A robust optimization approach for placement of applications in edge computing considering latency uncertainty
    Jeong, Jaehee
    Premsankar, Gopika
    Ghaddar, Bissan
    Tarkoma, Sasu
    OMEGA-INTERNATIONAL JOURNAL OF MANAGEMENT SCIENCE, 2024, 126
  • [27] Studying Offloading Optimization for Energy-Latency Tradeoff with Collaborative Edge Computing
    Padidem, Pranathi
    Lee, Ahyoung
    PROCEEDINGS OF THE 2022 16TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2022), 2022,
  • [28] Joint Optimization of Energy Consumption and Latency in Mobile Edge Computing for Internet of Things
    Cui, Laizhong
    Xu, Chong
    Yang, Shu
    Huang, Joshua Zhexue
    Li, Jianqiang
    Wang, Xizhao
    Ming, Zhong
    Lu, Nan
    IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (03) : 4791 - 4803
  • [29] UAV-Aided Low Latency Multi-Access Edge Computing
    Yu, Ye
    Bu, Xiangyuan
    Yang, Kai
    Yang, Hongyuan
    Gao, Xiaozheng
    Han, Zhu
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2021, 70 (05) : 4955 - 4967
  • [30] A Multi-Model Power Estimation Engine for Accuracy Optimization
    Klein, Felipe
    Araujo, G.
    Azevedo, Rodolfo
    Leao, Roberto
    dos Santos, Luiz C. V.
    ISLPED'07: PROCEEDINGS OF THE 2007 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, 2007, : 280 - 285