Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

Cited by: 5
Authors
Wu, Yongji [1]
Lentz, Matthew [1]
Zhuo, Danyang [1]
Lu, Yao [2]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
[2] Microsoft Res, Redmond, WA USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022 / Vol. 16 / No. 03
Keywords
68;
DOI
10.14778/3570690.3570692
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not take into account the more complex, tiered computing infrastructure that includes edge devices, local hubs, edge datacenters, and cloud datacenters. On the other hand, recent AutoML efforts have provided viable solutions for model compression, pruning, and quantization for heterogeneous environments; for a given machine learning model, we can now easily find or even generate a series of model variants with different tradeoffs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows on heterogeneous infrastructures. Given service-level objectives (e.g., throughput, accuracy), JellyBean picks the most cost-efficient models that meet the accuracy target and decides how to deploy them across different tiers of infrastructure. Evaluations show that JellyBean reduces the total serving cost of visual question answering by up to 58% and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%, compared with state-of-the-art model selection and worker assignment solutions. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving costs.
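The model-selection idea in the abstract (choose the cheapest model variant that still meets an accuracy target) can be illustrated with a small sketch. This is not JellyBean's optimizer, which per the abstract also handles throughput objectives and placement across infrastructure tiers. The `ModelVariant` type, the `pick_cheapest_variant` function, and all accuracy and cost numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical description of one model variant (not from the paper)."""
    name: str
    accuracy: float        # fraction in [0, 1]
    cost_per_query: float  # arbitrary cost units


def pick_cheapest_variant(
    variants: Sequence[ModelVariant], accuracy_target: float
) -> Optional[ModelVariant]:
    """Return the lowest-cost variant meeting the accuracy target, or None."""
    feasible = [v for v in variants if v.accuracy >= accuracy_target]
    if not feasible:
        return None
    return min(feasible, key=lambda v: v.cost_per_query)


# Made-up variants with different accuracy/cost tradeoffs.
variants = [
    ModelVariant("small", accuracy=0.70, cost_per_query=1.0),
    ModelVariant("medium", accuracy=0.80, cost_per_query=2.5),
    ModelVariant("large", accuracy=0.85, cost_per_query=6.0),
]

choice = pick_cheapest_variant(variants, accuracy_target=0.75)
print(choice.name if choice else "no feasible variant")
```

A single-model filter like this ignores how accuracy composes across a multi-model workflow and where each model runs; those are the parts the abstract attributes to JellyBean itself.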
Pages: 406-419
Number of pages: 14