Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

Cited by: 5
Authors
Wu, Yongji [1]
Lentz, Matthew [1]
Zhuo, Danyang [1]
Lu, Yao [2]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
[2] Microsoft Res, Redmond, WA USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022 / Volume 16 / Issue 03
Keywords
68;
DOI
10.14778/3570690.3570692
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not account for the more complex, tiered computing infrastructure that includes edge devices, local hubs, edge datacenters, and cloud datacenters. Meanwhile, recent AutoML efforts have provided viable solutions for model compression, pruning, and quantization in heterogeneous environments; for a given machine learning model, one can now easily find or even generate a series of model variants with different tradeoffs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows on heterogeneous infrastructures. Given service-level objectives (e.g., throughput, accuracy), JellyBean picks the most cost-efficient models that meet the accuracy target and decides how to deploy them across the different tiers of infrastructure. Evaluations show that JellyBean reduces the total serving cost of visual question answering by up to 58% and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%, compared with state-of-the-art model selection and worker assignment solutions. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving cost.
Pages: 406-419
Page count: 14