Precision and Performance-Aware Voltage Scaling in DNN Accelerators

Authors
Rathore, Mallika [1]
Milder, Peter [1]
Salman, Emre [1]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
Keywords
DNN accelerator; energy efficiency; voltage scaling; error detection
DOI
10.1145/3583781.3590202
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A methodology is proposed to enhance the energy efficiency of systolic-array-based deep neural network (DNN) accelerators by enabling precision- and performance-aware voltage scaling. The proposed framework consists of three primary steps. In the first step, the voltage-dependent timing error probability for each output bit within the processing elements is analytically estimated. Next, these timing errors are injected into DNN models, helping us understand how inference accuracy is affected by lower operating voltages. In the last step, we apply error detection and correction to only select bits within the network, thereby improving inference accuracy while minimizing circuit overhead. For a 256×256 array operating at 0.7 GHz and evaluating MobileNetV2 on ImageNet, we can reduce the nominal supply voltage from 0.9 V to 0.5 V with negligible (0.001%) latency overhead. This reduction in supply voltage reduces the inference energy by 79.4% while degrading inference accuracy by only 0.29%.
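The second and third steps of the abstract (injecting per-bit timing errors, then protecting only selected bits) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `inject_timing_errors`, the unsigned 16-bit output width, the per-bit probabilities, and the choice of protected bits are all hypothetical, and protection is modelled as ideal correction (a protected bit is never flipped).

```python
import random


def inject_timing_errors(values, bit_error_probs, protected_bits=(), seed=0):
    """Flip bit i of each unsigned integer with probability bit_error_probs[i].

    Bits listed in protected_bits are never flipped. This models ideal
    error detection and correction on those positions (an assumption,
    not a description of the paper's circuit).
    """
    rng = random.Random(seed)
    protected = set(protected_bits)
    corrupted = []
    for value in values:
        for bit, prob in enumerate(bit_error_probs):
            if bit not in protected and rng.random() < prob:
                value ^= 1 << bit  # a timing error inverts this output bit
        corrupted.append(value)
    return corrupted


# Hypothetical per-bit error probabilities for a 16-bit output at a reduced
# supply voltage: higher-order bits sit on longer carry paths, so they are
# assumed to fail more often. These numbers are illustrative only.
probs = [0.0] * 4 + [0.01] * 4 + [0.2] * 8
outputs = [1234, 40000, 7]

# Protect only the four most significant bits (positions 12 to 15).
noisy = inject_timing_errors(outputs, probs, protected_bits=range(12, 16))
```

In a fuller experiment, the corrupted values would replace the processing-element outputs during inference so that accuracy can be measured with and without protection; here the sketch only shows the bit-level mechanics.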
Pages: 237-242
Number of pages: 6