Coordinated DVFS and Precision Control for Deep Neural Networks

Cited by: 18
Authors
Nabavinejad, Seyed Morteza [1]
Hafez-Kolahi, Hassan [2]
Reda, Sherief [3]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
Funding
National Science Foundation (USA)
Keywords
Graphics processing units; Power demand; Time factors; Runtime; Time-frequency analysis; Servers; Neural networks; Deep neural network; hardware accelerator; power; accuracy; response time;
DOI
10.1109/LCA.2019.2942020
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Traditionally, DVFS has been the main mechanism to trade off performance against power. We observe that Deep Neural Network (DNN) applications offer the possibility to trade off performance, power, and accuracy using both DVFS and numerical precision levels. Our proposed approach, Power-Inference accuracy Trading (PIT), monitors the server's load and accordingly adjusts the precision of the DNN model and the DVFS setting of the GPU to trade off accuracy and power consumption against response time. At high loads and tight request arrivals, PIT leverages the GPU's INT8-precision instructions to dynamically change the precision of deployed DNN models and boosts GPU frequency to execute the requests faster, at the expense of reduced accuracy and higher power consumption. However, when the request arrival rate is relaxed and there is slack time for requests, PIT deploys the high-precision versions of the models to improve accuracy and reduces GPU frequency to decrease power consumption. We implement and deploy PIT on a state-of-the-art server equipped with a Tesla P40 GPU. Experimental results demonstrate that, depending on the load, PIT can improve response time by up to 11 percent compared to a job scheduler that uses only FP32 precision. It also reduces energy consumption by up to 28 percent, while achieving around 99.5 percent of the accuracy of FP32-only precision.
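The load-dependent choice described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Config` type, the `choose_config` function, the frequency values, the latency estimates, and the preference ordering between intermediate configurations are all hypothetical and chosen only for illustration. The sketch assumes each (precision, frequency) pair has a profiled latency and picks the most accurate, lowest-power pair that still fits the available slack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    precision: str         # "FP32" or "INT8"
    gpu_freq_mhz: int      # target GPU core clock (hypothetical values)
    est_latency_ms: float  # hypothetical profiled latency per request


# Hypothetical profile, ordered from most preferred (high precision,
# low frequency) to least preferred (low precision, high frequency).
# The relative order of the two middle entries is an assumption.
CONFIGS = [
    Config("FP32", 800, 40.0),
    Config("FP32", 1500, 25.0),
    Config("INT8", 800, 18.0),
    Config("INT8", 1500, 10.0),
]


def choose_config(slack_ms, configs=CONFIGS):
    """Return the first config whose estimated latency fits the slack.

    If no config fits, fall back to the fastest one (lowest latency).
    """
    for cfg in configs:
        if cfg.est_latency_ms <= slack_ms:
            return cfg
    return min(configs, key=lambda c: c.est_latency_ms)


if __name__ == "__main__":
    # Relaxed arrivals leave a lot of slack; tight arrivals leave little.
    for slack in (50.0, 20.0, 12.0, 5.0):
        print(slack, choose_config(slack))
```

With ample slack the sketch selects FP32 at the low frequency; as slack shrinks it moves toward INT8 at the high frequency, mirroring the direction of the trade-off the abstract describes.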
Pages: 136-140
Number of pages: 5