ApproxDNN: Incentivizing DNN Approximation in Cloud

Cited by: 3
Authors
Nabavinejad, Seyed Morteza [1]
Mashayekhy, Lena [2]
Reda, Sherief [3]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
Keywords
cloud computing; approximate computing; deep neural network; cost minimization; MECHANISM;
DOI
10.1109/CCGrid49817.2020.00-29
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Service providers leverage the discounted prices of reserved instances offered by cloud providers to amortize their operational costs. They reserve a certain number of instances to cover a significant portion of their computing resource requirements, and employ on-demand instances to cover the remaining requirements that the reserved instances cannot satisfy. Because on-demand instances are priced higher, service providers seek to lower their usage to minimize operational costs. In this work, we propose ApproxDNN, an approach for Machine Learning as a Service that reduces the operational costs of service providers by incentivizing approximate results, based on the capabilities of cutting-edge GPUs and a discounted pricing model. When the deadlines of jobs submitted by users are very tight, a service provider might not be able to execute all of them on reserved instances under the default precision. In such cases, ApproxDNN leverages reduced-precision instructions to shorten the execution time of the jobs at the expense of a slight reduction in their final accuracy, and consequently minimizes the use of on-demand instances. To incentivize users to accept the approximate results of reduced-precision instructions, ApproxDNN offers them a discounted price for the service based on a newly designed pricing model. This pricing model guarantees a cost for service providers that is lower than or equal to that of the conventional method, which relies solely on on-demand instances when reserved instances fall short. We use real-world traces to conduct an extensive set of experiments and evaluate the performance of our proposed approach. The results show that ApproxDNN reduces the cost of service providers by 18%, never exceeds the cost of the conventional method, and lowers accuracy only slightly, by 0.14%.
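The abstract does not spell out the pricing model, so the following is only a minimal sketch of the trade-off it describes, under stated assumptions: reserved capacity is a sunk cost, every job in a batch is switched to reduced precision together, and the discount is a flat fraction of the job price. All names (`Job`, `conventional_cost`, `approximate_cost`, `plan`) and parameters are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    price: float          # assumed: what the user pays at default precision
    hours_default: float  # assumed: GPU hours needed at default precision
    hours_reduced: float  # assumed: GPU hours needed at reduced precision


def conventional_cost(jobs, reserved_hours, on_demand_rate):
    """Extra cost when the reserved shortfall is covered by on-demand hours."""
    needed = sum(j.hours_default for j in jobs)
    overflow = max(0.0, needed - reserved_hours)
    return overflow * on_demand_rate


def approximate_cost(jobs, reserved_hours, on_demand_rate, discount):
    """Extra cost when jobs run at reduced precision and users get a discount."""
    needed = sum(j.hours_reduced for j in jobs)
    overflow = max(0.0, needed - reserved_hours)
    lost_revenue = discount * sum(j.price for j in jobs)
    return overflow * on_demand_rate + lost_revenue


def plan(jobs, reserved_hours, on_demand_rate, discount):
    """Choose the cheaper option; falling back to the conventional method
    means the chosen cost is never higher than the conventional cost."""
    conv = conventional_cost(jobs, reserved_hours, on_demand_rate)
    approx = approximate_cost(jobs, reserved_hours, on_demand_rate, discount)
    if approx < conv:
        return ("approximate", approx)
    return ("conventional", conv)
```

Calling `plan` with a batch whose default-precision hours exceed `reserved_hours` compares the on-demand bill against the revenue given up through the discount. Because the conventional method remains the fallback, the selected cost cannot exceed it, which mirrors the "lower or equal cost" guarantee stated in the abstract without claiming to reproduce how the paper achieves it.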
Pages: 639-648
Page count: 10