Multi-Model Running Latency Optimization in an Edge Computing Paradigm

Cited by: 12
Authors
Li, Peisong [1]
Wang, Xinheng [1]
Huang, Kaizhu [2]
Huang, Yi [3]
Li, Shancang [4]
Iqbal, Muddesar [5]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Suzhou 215123, Peoples R China
[2] Duke Kunshan Univ, Data Sci Res Ctr, Div Nat & Appl Sci, Suzhou 215316, Peoples R China
[3] Univ Liverpool, Dept Elect Engn & Elect, Liverpool L69 3BX, Merseyside, England
[4] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 3AT, Wales
[5] Prince Sultan Univ, Coll Engn, Commun & Networks Engn Dept, Renewable Energy Lab, Riyadh 11586, Saudi Arabia
Funding
National Natural Science Foundation of China
Keywords
edge computing; latency optimization; multi-model; task scheduling; autonomous driving; AI
DOI
10.3390/s22166097
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Recent advances in lightweight deep learning algorithms and edge computing increasingly allow multiple model inference tasks to run concurrently on resource-constrained edge devices, so that several models can work together toward one goal rather than each pursuing high quality on a standalone task. However, the high overall running latency of multi-model inference degrades real-time applications. Multi-model deployment must therefore be optimized to minimize latency without compromising safety-critical situations. This work focuses on a real-time task scheduling strategy for multi-model deployment and investigates model inference using the Open Neural Network Exchange (ONNX) Runtime engine. An application deployment strategy based on container technology is then proposed, and inference tasks are scheduled to different containers according to the scheduling strategies. Experimental results show that the proposed solution significantly reduces the overall running latency in real-time applications.
Pages: 17
Related papers
50 in total
  • [1] Multiagent Reinforcement Learning-Based Multimodel Running Latency Optimization in Vehicular Edge Computing Paradigm
    Li, Peisong
    Xiao, Ziren
    Wang, Xinheng
    Iqbal, Muddesar
    Casaseca-de-la-Higuera, Pablo
    IEEE SYSTEMS JOURNAL, 2024, 18 (04): 1860-1870
  • [2] POS: An Operator Scheduling Framework for Multi-model Inference on Edge Intelligent Computing
    Zhang, Ziyang
    Li, Huan
    Zhao, Yang
    Lin, Changyao
    Liu, Jie
    PROCEEDINGS OF THE 2023 THE 22ND INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING IN SENSOR NETWORKS, IPSN 2023, 2023: 40-52
  • [3] Latency Optimization for Blockchain-Empowered Federated Learning in Multi-Server Edge Computing
    Nguyen, Dinh C.
    Hosseinalipour, Seyyedali
    Love, David J.
    Pathirana, Pubudu N.
    Brinton, Christopher G.
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2022, 40 (12): 3373-3390
  • [4] Latency Optimization for Mobile Edge Computing with Dynamic Energy Harvesting
    Sun, Yifei
    Wu, Jigang
    Chen, Long
    Liu, Tonglai
    Yao, Mianyang
    Sun, Weijun
    2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019), 2019: 79-83
  • [5] Latency Aware Placement in Multi-access Edge Computing
    Harris, Dor
    Naor, Joseph
    Raz, Danny
    2018 4TH IEEE CONFERENCE ON NETWORK SOFTWARIZATION AND WORKSHOPS (NETSOFT), 2018: 132-140
  • [6] Multi-Access Edge Computing: An Overview and Latency Evaluation
    Miladinovic, Igor
    Schefer-Wenzl, Sigrid
    Burger, Thomas
    Hirner, Heimo
    2021 22ND IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2021: 744-748
  • [7] Towards a Multi-model Paradigm for Business Process Management
    Alman, Anti
    Maggi, Fabrizio Maria
    Rinderle-Ma, Stefanie
    Rivkin, Andrey
    Winter, Karolin
    ADVANCED INFORMATION SYSTEMS ENGINEERING, CAISE 2024, 2024, 14663: 178-194
  • [8] Beyond boundaries a hybrid cellular potts and particle swarm optimization model for energy and latency optimization in edge computing
    Sahu, Dinesh
    Nidhi, Shiv
    Prakash, Shiv
    Sinha, Priyanshu
    Yang, Tiansheng
    Rathore, Rajkumar Singh
    Wang, Lu
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [9] Latency and Reliability Oriented Collaborative Optimization for Multi-UAV Aided Mobile Edge Computing System
    Hou, Xiangwang
    Ren, Zhiyuan
    Wang, Jingjing
    Zheng, Shuya
    Mang, Hailin
    IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), 2020: 150-156
  • [10] Multi-UAV-Assisted Offloading for Joint Optimization of Energy Consumption and Latency in Mobile Edge Computing
    Tang, Qiang
    Wen, Sihao
    He, Shiming
    Yang, Kun
    IEEE SYSTEMS JOURNAL, 2024, 18 (02): 1414-1425