POS: An Operator Scheduling Framework for Multi-model Inference on Edge Intelligent Computing

Cited by: 5

Authors:

Zhang, Ziyang [1]

Li, Huan [2]

Zhao, Yang [2]

Lin, Changyao [1]

Liu, Jie [2]

Affiliations:

[1] Harbin Institute of Technology, Harbin, Heilongjiang, People's Republic of China

[2] Harbin Institute of Technology, Shenzhen, Guangdong, People's Republic of China

Source:

PROCEEDINGS OF THE 2023 THE 22ND INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING IN SENSOR NETWORKS, IPSN 2023 | 2023

Funding:

National Key Research and Development Program of China;

Keywords:

edge computing; multi-model inference; operator scheduling; deep reinforcement learning;

DOI:

10.1145/3583120.3586953

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Edge intelligent applications, such as autonomous driving, usually deploy multiple inference models on resource-constrained edge devices to execute a diverse range of concurrent tasks over large amounts of input data. One challenge is that these tasks must produce reliable inference results simultaneously, with millisecond-level latency, to achieve real-time performance and high quality of service (QoS). However, most existing deep learning frameworks focus only on optimizing a single inference model on an edge device. To accelerate multi-model inference on a resource-constrained edge device, in this paper we propose POS, a novel operator-level scheduling framework that combines four operator scheduling strategies. The key to POS is MEOS, a maximum entropy reinforcement learning-based operator scheduling algorithm that generates an optimal schedule automatically. Extensive experiments show that POS outperforms five state-of-the-art inference frameworks (TensorFlow, PyTorch, TensorRT, TVM, and IOS), consistently achieving up to 1.2x~3.9x inference speedup, with a 40% improvement in GPU utilization. Meanwhile, MEOS reduces the scheduling overhead by 37% on average compared to five baseline methods: sequential execution, dynamic programming, greedy scheduling, actor-critic, and coordinate descent search.

Pages: 40 - 52

Number of pages: 13
