Multi-Model Running Latency Optimization in an Edge Computing Paradigm

Cited by: 12
Authors
Li, Peisong [1]
Wang, Xinheng [1]
Huang, Kaizhu [2]
Huang, Yi [3]
Li, Shancang [4]
Iqbal, Muddesar [5]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Suzhou 215123, Peoples R China
[2] Duke Kunshan Univ, Data Sci Res Ctr, Div Nat & Appl Sci, Suzhou 215316, Peoples R China
[3] Univ Liverpool, Dept Elect Engn & Elect, Liverpool L69 3BX, Merseyside, England
[4] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 3AT, Wales
[5] Prince Sultan Univ, Coll Engn, Commun & Networks Engn Dept, Renewable Energy Lab, Riyadh 11586, Saudi Arabia
Funding
National Natural Science Foundation of China
Keywords
edge computing; latency optimization; multi-model; task scheduling; autonomous driving; AI;
DOI
10.3390/s22166097
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Recent advances in both lightweight deep learning algorithms and edge computing increasingly enable multiple model inference tasks to be conducted concurrently on resource-constrained edge devices, allowing us to achieve one goal collaboratively rather than getting high quality in each standalone task. However, the high overall running latency for performing multi-model inferences always negatively affects the real-time applications. To combat latency, the algorithms should be optimized to minimize the latency for multi-model deployment without compromising the safety-critical situation. This work focuses on the real-time task scheduling strategy for multi-model deployment and investigating the model inference using an open neural network exchange (ONNX) runtime engine. Then, an application deployment strategy is proposed based on the container technology and inference tasks are scheduled to different containers based on the scheduling strategies. Experimental results show that the proposed solution is able to significantly reduce the overall running latency in real-time applications.
Pages: 17