Coordinated DVFS and Precision Control for Deep Neural Networks

Cited by: 18
Authors
Nabavinejad, Seyed Morteza [1 ]
Hafez-Kolahi, Hassan [2 ]
Reda, Sherief [3 ]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
Funding
National Science Foundation (USA)
Keywords
Graphics processing units; Power demand; Time factors; Runtime; Time-frequency analysis; Servers; Neural networks; Deep neural network; hardware accelerator; power; accuracy; response time
DOI
10.1109/LCA.2019.2942020
CLC classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Traditionally, DVFS has been the main mechanism for trading off performance and power. We observe that Deep Neural Network (DNN) applications offer the possibility of trading off performance, power, and accuracy by combining DVFS with numerical precision levels. Our proposed approach, Power-Inference accuracy Trading (PIT), monitors the server's load and accordingly adjusts the precision of the DNN model and the DVFS setting of the GPU, trading off accuracy and power consumption against response time. Under high load with tight request arrivals, PIT leverages the GPU's INT8-precision instructions to dynamically lower the precision of deployed DNN models and boosts the GPU frequency to serve requests faster, at the expense of reduced accuracy and higher power consumption. Conversely, when the request arrival rate is relaxed and there is slack time for requests, PIT deploys high-precision versions of the models to improve accuracy and reduces the GPU frequency to lower power consumption. We implement and deploy PIT on a state-of-the-art server equipped with a Tesla P40 GPU. Experimental results demonstrate that, depending on the load, PIT can improve response time by up to 11 percent compared to a job scheduler that uses only FP32 precision. It also reduces energy consumption by up to 28 percent while achieving around 99.5 percent of the accuracy of FP32-only precision.
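The abstract describes PIT as a controller that maps observed request slack to a (precision, DVFS) operating point. A minimal sketch of that decision logic is shown below; the function name, threshold value, and frequency labels are illustrative assumptions, not the paper's published pseudocode.

```python
def pit_policy(slack_ms, tight_threshold_ms=5.0):
    """Pick a DNN precision and a GPU DVFS setting from per-request slack.

    slack_ms: estimated time remaining before the response-time target
    would be violated. The threshold is a hypothetical tuning knob.
    Returns a (precision, gpu_frequency_setting) pair.
    """
    if slack_ms < tight_threshold_ms:
        # Tight arrivals: switch to INT8 and boost frequency, trading
        # accuracy and power for lower response time.
        return ("INT8", "max_boost")
    # Relaxed arrivals: restore FP32 accuracy and drop the frequency
    # to reduce power consumption.
    return ("FP32", "low_power")
```

In a real deployment this decision would be re-evaluated as the arrival rate changes, with the precision switch applied to the deployed model and the frequency applied through the GPU's DVFS interface.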
Pages: 136-140
Page count: 5
Related papers
50 items in total
  • [41] Boo, Yoonho; Shin, Sungho; Choi, Jungwook; Sung, Wonyong. Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks. Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021, 35: 6794-6802.
  • [42] Li, Jianlin; Liu, Jiangchao; Yang, Pengfei; Chen, Liqian; Huang, Xiaowei; Zhang, Lijun. Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification. Static Analysis (SAS 2019), 2019, 11822: 296-319.
  • [43] Fox, Sean; Faraone, Julian; Boland, David; Vissers, Kees; Leong, Philip H. W. Training Deep Neural Networks in Low-Precision with High Accuracy using FPGAs. 2019 International Conference on Field-Programmable Technology (ICFPT 2019), 2019: 1-9.
  • [44] Sakr, Charbel; Shanbhag, Naresh. An Analytical Method to Determine Minimum Per-Layer Precision of Deep Neural Networks. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018: 1090-1094.
  • [45] Sun, Xiao; Wang, Naigang; Chen, Chia-Yu; Ni, Jia-Min; Agrawal, Ankur; Cui, Xiaodong; Venkataramani, Swagath; El Maghraoui, Kaoutar; Srinivasan, Vijayalakshmi; Gopalakrishnan, Kailash. Ultra-Low Precision 4-bit Training of Deep Neural Networks. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33.
  • [46] Nandakumar, S. R.; Le Gallo, Manuel; Boybat, Irem; Rajendran, Bipin; Sebastian, Abu; Eleftheriou, Evangelos. Mixed-precision architecture based on computational memory for training deep neural networks. 2018 IEEE International Symposium on Circuits and Systems (ISCAS), 2018.
  • [47] Fuengfusin, Ninnart; Tamukoh, Hakaru. Mixed Precision Weight Networks: Training Neural Networks with Varied Precision Weights. Neural Information Processing (ICONIP 2018), Pt II, 2018, 11302: 614-623.
  • [48] Nabavinejad, Seyed Morteza; Reda, Sherief; Ebrahimi, Masoumeh. Coordinated Batching and DVFS for DNN Inference on GPU Accelerators. IEEE Transactions on Parallel and Distributed Systems, 2022, 33 (10): 2496-2508.
  • [49] Drummond, S; Joshi, A; Sudduth, KA. Application of neural networks: Precision farming. IEEE World Congress on Computational Intelligence, 1998: 211-215.
  • [50] Zhang, Ticao; Mao, Shiwen. Energy-Efficient Power Control in Wireless Networks With Spatial Deep Neural Networks. IEEE Transactions on Cognitive Communications and Networking, 2020, 6 (01): 111-124.