Precision and Performance-Aware Voltage Scaling in DNN Accelerators

Cited by: 0

Authors:

Rathore, Mallika [1]

Milder, Peter [1]

Salman, Emre [1]

Affiliations:

[1] SUNY Stony Brook, Stony Brook, NY 11794 USA

Source:

PROCEEDINGS OF THE GREAT LAKES SYMPOSIUM ON VLSI 2023, GLSVLSI 2023 | 2023

Keywords:

DNN accelerator; energy efficiency; voltage scaling; error detection

DOI:

10.1145/3583781.3590202

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A methodology is proposed to enhance the energy efficiency of systolic-array-based deep neural network (DNN) accelerators by enabling precision- and performance-aware voltage scaling. The proposed framework consists of three primary steps. In the first step, the voltage-dependent timing error probability for each output bit within the processing elements is analytically estimated. Next, these timing errors are injected into DNN models to quantify how inference accuracy is affected by lower operating voltages. In the last step, error detection and correction is applied to only select bits within the network, thereby improving inference accuracy while minimizing circuit overhead. For a 256×256 array operating at 0.7 GHz and evaluating MobileNetV2 on ImageNet, the nominal supply voltage can be reduced from 0.9 V to 0.5 V with negligible (0.001%) latency overhead. This reduction in supply voltage reduces the inference energy by 79.4% while degrading inference accuracy by only 0.29%.
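The second and third steps of the abstract (injecting per-bit timing errors, then protecting only selected bits) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 8-bit output width, and the per-bit probabilities below are all hypothetical, and error detection and correction is modeled as ideal by simply setting a protected bit's error probability to zero.

```python
import random

def inject_timing_errors(value, bit_error_probs, rng):
    """Flip bit i of a non-negative integer with probability bit_error_probs[i] (LSB first)."""
    for bit, p in enumerate(bit_error_probs):
        if rng.random() < p:
            value ^= 1 << bit
    return value

def protect_bits(bit_error_probs, protected):
    """Model ideal detection and correction on the chosen bits (an assumption):
    a protected bit never delivers a wrong value, so its probability becomes 0."""
    return [0.0 if i in protected else p for i, p in enumerate(bit_error_probs)]

def mean_abs_error(bit_error_probs, trials=10_000, width=8, seed=0):
    """Average absolute deviation caused by injected errors on random outputs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        clean = rng.randrange(1 << width)
        total += abs(inject_timing_errors(clean, bit_error_probs, rng) - clean)
    return total / trials

# Hypothetical per-bit error probabilities at a reduced voltage (LSB first).
# Assumption for illustration only: higher-order bits fail more often.
probs = [0.0, 0.0, 0.0, 0.0, 0.001, 0.005, 0.02, 0.05]

unprotected = mean_abs_error(probs)
protected = mean_abs_error(protect_bits(probs, {6, 7}))  # protect the two MSBs only
print(f"mean |error| unprotected: {unprotected:.3f}, two MSBs protected: {protected:.3f}")
```

In this toy model, protecting only the two most significant bits removes most of the numerical error, which mirrors the abstract's motivation for applying correction to select bits rather than to every bit.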

Pages: 237-242

Number of pages: 6

