Coordinated DVFS and Precision Control for Deep Neural Networks

Cited by: 18
Authors
Nabavinejad, Seyed Morteza [1 ]
Hafez-Kolahi, Hassan [2 ]
Reda, Sherief [3 ]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
Funding
National Science Foundation (USA);
Keywords
Graphics processing units; Power demand; Time factors; Runtime; Time-frequency analysis; Servers; Neural networks; Deep neural network; hardware accelerator; power; accuracy; response time;
DOI
10.1109/LCA.2019.2942020
Chinese Library Classification (CLC)
TP3 [computing technology, computer technology];
Discipline code
0812 ;
Abstract
Traditionally, DVFS has been the main mechanism for trading off performance and power. We observe that Deep Neural Network (DNN) applications make it possible to trade off performance, power, and accuracy by combining DVFS with numerical precision levels. Our proposed approach, Power-Inference accuracy Trading (PIT), monitors the server's load and accordingly adjusts the precision of the DNN model and the DVFS setting of the GPU to trade off accuracy and power consumption against response time. Under high load with tight request arrivals, PIT leverages the GPU's INT8-precision instructions to dynamically lower the precision of deployed DNN models and boosts the GPU frequency to execute requests faster, at the cost of reduced accuracy and higher power consumption. Conversely, when the request arrival rate is relaxed and there is slack time for requests, PIT deploys high-precision versions of the models to improve accuracy and reduces the GPU frequency to cut power consumption. We implement and deploy PIT on a state-of-the-art server equipped with a Tesla P40 GPU. Experimental results demonstrate that, depending on the load, PIT can improve response time by up to 11 percent compared to a job scheduler that uses only FP32 precision. It also reduces energy consumption by up to 28 percent while achieving around 99.5 percent of the accuracy of FP32-only execution.
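The load-dependent policy described in the abstract can be sketched as a simple decision rule. The thresholds, clock values, and function names below are illustrative assumptions, not details taken from the paper:

```python
# Minimal sketch of PIT-style precision/DVFS selection.
# All numeric thresholds and frequencies here are hypothetical;
# the paper's actual controller and values are not reproduced.
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    precision: str    # "INT8" or "FP32"
    gpu_freq_mhz: int


HIGH_FREQ_MHZ = 1531  # assumed boost clock for illustration
LOW_FREQ_MHZ = 1114   # assumed reduced clock for illustration


def choose_config(arrival_rate_rps: float, slack_ms: float,
                  rate_threshold: float = 100.0,
                  slack_threshold: float = 20.0) -> Config:
    """Pick precision and DVFS setting from the observed load.

    High request rate or little slack -> INT8 at boosted frequency
    (faster service at the cost of accuracy and power).
    Relaxed load with ample slack -> FP32 at reduced frequency
    (higher accuracy, lower power consumption).
    """
    if arrival_rate_rps >= rate_threshold or slack_ms < slack_threshold:
        return Config("INT8", HIGH_FREQ_MHZ)
    return Config("FP32", LOW_FREQ_MHZ)
```

A real deployment would read the arrival rate from the serving queue and apply the chosen frequency via the GPU's DVFS interface (e.g., NVML application clocks); this sketch only captures the trade-off logic itself.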
Pages: 136-140 (5 pages)
Related Papers (50 records)
  • [1] Any-Precision Deep Neural Networks
    Yu, Haichao
    Li, Haoxiang
    Shi, Humphrey
    Huang, Thomas S.
    Hua, Gang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10763 - 10771
  • [2] A Variable Precision Approach for Deep Neural Networks
    Xuan-Tuyen Tran
    Duy-Anh Nguyen
    Duy-Hieu Bui
    Xuan-Tu Tran
    2019 12TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC 2019), 2019, : 313 - 318
  • [3] Coordinated Wide-Area Damping Control Using Deep Neural Networks and Reinforcement Learning
    Gupta, Pooja
    Pal, Anamitra
    Vittal, Vijay
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2022, 37 (01) : 365 - 376
  • [4] Analytical Guarantees on Numerical Precision of Deep Neural Networks
    Sakr, Charbel
    Kim, Yongjune
    Shanbhag, Naresh
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [5] Structured Dynamic Precision for Deep Neural Networks Quantization
    Huang, Kai
    Li, Bowen
    Xiong, Dongliang
    Jiang, Haitian
    Jiang, Xiaowen
    Yan, Xiaolang
    Claesen, Luc
    Liu, Dehong
    Chen, Junjian
    Liu, Zhili
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2023, 28 (01)
  • [6] Guest editorial: Deep neural networks for precision medicine
    Wu, Fang-Xiang
    Li, Min
    Kurgan, Lukasz
    Rueda, Luis
    NEUROCOMPUTING, 2022, 469 : 330 - 331
  • [7] Proteus: Exploiting precision variability in deep neural networks
    Judd, Patrick
    Albericio, Jorge
    Hetherington, Tayler
    Aamodt, Tor
    Jerger, Natalie Enright
    Urtasun, Raquel
    Moshovos, Andreas
    PARALLEL COMPUTING, 2018, 73 : 40 - 51
  • [8] Better schedules for low precision training of deep neural networks
    Cameron R. Wolfe
    Anastasios Kyrillidis
    Machine Learning, 2024, 113 : 3569 - 3587
  • [9] Hardware for Quantized Mixed-Precision Deep Neural Networks
    Rios, Andres
    Nava, Patricia
    PROCEEDINGS OF THE 2022 15TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS 2022), 2022,