Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

Cited by: 5
Authors
Wu, Yongji [1]
Lentz, Matthew [1]
Zhuo, Danyang [1]
Lu, Yao [2]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
[2] Microsoft Res, Redmond, WA USA
Source
Proceedings of the VLDB Endowment | 2022 / Vol. 16 / No. 3
Keywords
68;
DOI
10.14778/3570690.3570692
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the advent of ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not take into account the more complex and tiered computing infrastructure that includes edge devices, local hubs, edge datacenters, and cloud datacenters. On the other hand, recent AutoML efforts have provided viable solutions for model compression, pruning, and quantization for heterogeneous environments; for a given machine learning model, we can now easily find or even generate a series of model variants with different tradeoffs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows on heterogeneous infrastructures. Given service-level objectives (e.g., throughput, accuracy), JellyBean picks the most cost-efficient models that meet the accuracy target and decides how to deploy them across different tiers of infrastructure. Evaluations show that JellyBean reduces the total serving cost of visual question answering by up to 58% and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%, compared with state-of-the-art model selection and worker assignment solutions. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving costs.
Pages: 406-419
Number of pages: 14