POS: An Operator Scheduling Framework for Multi-model Inference on Edge Intelligent Computing

Cited by: 5
Authors
Zhang, Ziyang [1]
Li, Huan [2]
Zhao, Yang [2]
Lin, Changyao [1]
Liu, Jie [2]
Affiliations
[1] Harbin Inst Technol, Harbin, Heilongjiang, Peoples R China
[2] Harbin Inst Technol, Shenzhen, Guangdong, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
edge computing; multi-model inference; operator scheduling; deep reinforcement learning
DOI
10.1145/3583120.3586953
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract
Edge intelligent applications, such as autonomous driving, usually deploy multiple inference models on resource-constrained edge devices to execute a diverse range of concurrent tasks, given large amounts of input data. One challenge is that these tasks need to produce reliable inference results simultaneously with millisecond-level latency to achieve real-time performance and high quality of service (QoS). However, most existing deep learning frameworks focus only on optimizing a single inference model on an edge device. To accelerate multi-model inference on a resource-constrained edge device, in this paper we propose POS, a novel operator-level scheduling framework that combines four operator scheduling strategies. The key to POS is MEOS, a maximum entropy reinforcement learning-based operator scheduling algorithm that generates an optimal schedule automatically. Extensive experiments show that POS outperforms five state-of-the-art inference frameworks (TensorFlow, PyTorch, TensorRT, TVM, and IOS), consistently achieving 1.2x to 3.9x inference speedup, with a 40% improvement in GPU utilization. Meanwhile, MEOS reduces the scheduling overhead by 37% on average compared to five baseline methods: sequential execution, dynamic programming, greedy scheduling, actor-critic, and coordinate descent search.
Pages: 40-52
Number of pages: 13