ApproxDNN: Incentivizing DNN Approximation in Cloud

Cited by: 3

Authors:

Nabavinejad, Seyed Morteza [1]

Mashayekhy, Lena [2]

Reda, Sherief [3]

Affiliations:

[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran

[2] Univ Delaware, Dept Comp & Informat Sci, Newark, DE 19716 USA

[3] Brown Univ, Sch Engn, Providence, RI 02912 USA

Source:

2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020) | 2020

Keywords:

cloud computing; approximate computing; deep neural network; cost minimization; mechanism

DOI:

10.1109/CCGrid49817.2020.00-29

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Service providers use the discounted prices of reserved instances offered by cloud providers to amortize their operational costs. They reserve a fixed number of instances to cover a significant portion of their computing requirements and use on-demand instances to cover whatever the reserved instances cannot satisfy. Because on-demand instances are more expensive, service providers seek to reduce their use of them to minimize operational costs. In this work, we propose ApproxDNN, an approach for Machine Learning as a Service that reduces the operational costs of service providers by incentivizing approximate results, based on the capabilities of cutting-edge GPUs and a discounted pricing model. When the deadlines of user-submitted jobs are very tight, a service provider might not be able to execute all of them on reserved instances at the default precision. In such cases, ApproxDNN uses reduced-precision instructions to shorten job execution time at the cost of a slight reduction in final accuracy, and consequently minimizes the use of on-demand instances. To incentivize users to accept the approximate results produced by reduced-precision instructions, ApproxDNN offers them a discounted price for the service based on a newly designed pricing model. The pricing model guarantees that the service provider's cost is lower than or equal to that of the conventional method, which relies solely on on-demand instances when reserved instances fall short. We use real-world traces to conduct an extensive set of experiments and evaluate the performance of the proposed approach. The results show that ApproxDNN reduces the cost of service providers by 18%, never exceeds the cost of the conventional method, and affects accuracy only slightly, by 0.14%.
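The cost argument in the abstract can be illustrated with a small sketch. This is not the paper's pricing model, which the abstract does not specify; it is a simplified illustration under assumed rules. The `Job` fields, the flat `discount_fraction`, the greedy ordering, and the aggregation of deadlines into a single pool of reserved hours are all hypothetical. A job is switched to reduced precision only when the discount paid to its user is smaller than the on-demand spending it avoids, which is what keeps the cost from exceeding the conventional on-demand-only cost.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # All fields are hypothetical, chosen only for this illustration.
    full_hours: float     # runtime at default precision
    reduced_hours: float  # runtime with reduced-precision instructions
    price: float          # price the user pays at default precision


def extra_cost(jobs, reserved_hours, on_demand_rate, discount_fraction):
    """Return (conventional, approximate) cost beyond the reserved capacity."""
    # Hours that do not fit on reserved instances at default precision.
    overflow = max(0.0, sum(j.full_hours for j in jobs) - reserved_hours)
    conventional = overflow * on_demand_rate

    def discount_per_hour_saved(job):
        saved = job.full_hours - job.reduced_hours
        return discount_fraction * job.price / max(saved, 1e-9)

    approximate = 0.0
    remaining = overflow
    # Assumed greedy rule: try the cheapest discount per saved hour first.
    for job in sorted(jobs, key=discount_per_hour_saved):
        if remaining <= 0:
            break
        saved = min(job.full_hours - job.reduced_hours, remaining)
        discount = discount_fraction * job.price
        # Switch only if the discount costs less than the on-demand
        # hours it avoids, so every switch lowers the total cost.
        if saved > 0 and discount < saved * on_demand_rate:
            approximate += discount
            remaining -= saved

    # Whatever overflow is left still runs on on-demand instances.
    approximate += remaining * on_demand_rate
    return conventional, approximate


jobs = [Job(10, 6, 20), Job(8, 5, 16), Job(4, 3, 8)]
conventional, approximate = extra_cost(
    jobs, reserved_hours=18, on_demand_rate=3.0, discount_fraction=0.25
)
print(conventional, approximate)
```

Because each accepted switch costs strictly less than the on-demand hours it replaces, the second value returned can never be larger than the first in this sketch, mirroring the "lower or equal cost" guarantee stated in the abstract.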


Pages: 639-648

Page count: 10

