Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

Cited by: 5

Authors:

Wu, Yongji [1]

Lentz, Matthew [1]

Zhuo, Danyang [1]

Lu, Yao [2]

Affiliations:

[1] Duke Univ, Durham, NC 27706 USA

[2] Microsoft Res, Redmond, WA USA

Source:

Proceedings of the VLDB Endowment | 2022, Vol. 16, No. 3

Keywords:

68;

DOI:

10.14778/3570690.3570692

Chinese Library Classification (CLC) code:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

With the ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not take into account the more complex, tiered computing infrastructure that includes edge devices, local hubs, edge datacenters, and cloud datacenters. On the other hand, recent AutoML efforts have provided viable solutions for model compression, pruning, and quantization for heterogeneous environments; for a given machine learning model, we can now easily find or even generate a series of model variants with different tradeoffs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows on heterogeneous infrastructures. Given service-level objectives (e.g., throughput, accuracy), JellyBean picks the most cost-efficient models that meet the accuracy target and decides how to deploy them across different tiers of infrastructure. Evaluations show that JellyBean reduces the total serving cost of visual question answering by up to 58% and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%, compared with state-of-the-art model selection and worker assignment solutions. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving costs.

Pages: 406-419

Page count: 14

50 items in total

[31] An Interface Design Methodology for Serving Machine Learning Models
Ogbuju, Emeka
Ihinkalu, Olalekan
Oladipo, Francisca
PROCEEDINGS OF THE 4TH AFRICAN CONFERENCE FOR HUMAN COMPUTER INTERACTION, AFRICHI 2023, 2023, : 12 - 14
[32] A Tensor Compiler for Unified Machine Learning Prediction Serving
Nakandala, Supun
Saur, Karla
Yu, Gyeong-In
Karanasos, Konstantinos
Curino, Carlo
Weimer, Markus
Interlandi, Matteo
PROCEEDINGS OF THE 14TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '20), 2020, : 899 - 917
[33] Adaptive resource provisioning method using application-aware machine learning based on job history in heterogeneous infrastructures
Jieun Choi
Yoonhee Kim
Cluster Computing, 2017, 20 : 3537 - 3549
[34] OPTIMIZING DRUG SCREENING WITH MACHINE LEARNING
Chen Lin
Zhou Xiaoxiao
2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,
[35] Adaptive resource provisioning method using application-aware machine learning based on job history in heterogeneous infrastructures
Choi, Jieun
Kim, Yoonhee
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2017, 20 (04): : 3537 - 3549
[36] Optimizing Data Collection for Machine Learning
Mahmood, Rafid
Lucas, James
Alvarez, Jose M.
Fidler, Sanja
Law, Marc T.
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[37] Accelerating Cancer Histopathology Workflows with Chemical Imaging and Machine Learning
Falahkheirkhah, Kianoush
Mukherjee, Sudipta S.
Gupta, Sounak
Herrera-Hernandez, Loren
Mccarthy, Michael R.
Jimenez, Rafael E.
Cheville, John C.
Bhargava, Rohit
CANCER RESEARCH COMMUNICATIONS, 2023, 3 (09): : 1875 - 1887
[38] Semantic Description of Explainable Machine Learning Workflows for Improving Trust
Nakagawa, Patricia Inoue
Pires, Luis Ferreira
Moreira, Joao Luiz Rebelo
Santos, Luiz Olavo Bonino da Silva
Bukhsh, Faiza
APPLIED SCIENCES-BASEL, 2021, 11 (22):
[39] Heterogeneous transfer learning techniques for machine learning
Muhammad Shahid Iqbal
Bin Luo
Tamoor Khan
Rashid Mehmood
Muhammad Sadiq
Iran Journal of Computer Science, 2018, 1 (1) : 31 - 46
[40] Optimizing Federated Learning with Heterogeneous Edge Devices
Islam, Mohammad Munzurul
Alawad, Mohammed
2024 IEEE 3RD INTERNATIONAL CONFERENCE ON COMPUTING AND MACHINE INTELLIGENCE, ICMI 2024, 2024,
