On Energy Nonproportionality of CPUs and GPUs

Cited by: 1

Authors:

Manumachu, Ravi Reddy [1]

Lastovetsky, Alexey [1]

Affiliation:

[1] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland

Source:

2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022) | 2022

Funding:

Science Foundation Ireland;

Keywords:

Energy Proportionality; Multicore CPU; GPU; Bi-objective Optimization; Energy; Performance; 2D FFT; Matrix Multiplication; Data-parallel Applications; Model;

DOI:

10.1109/IPDPSW55747.2022.00015

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

Energy proportionality (EP) means designing a system that consumes energy proportional to the amount of work it performs. For an EP system, optimizing an application for performance also optimizes the application for total energy. Energy-proportional multicore CPUs and graphics processing units (GPUs) are fundamental to addressing the grand technological challenge of energy efficiency in Information and Communications Technology. In this work, we formally propose strong and weak notions of EP for modern microprocessors. Multicore CPUs were experimentally found to violate both strong and weak EP. This work presents the first attempt at a theoretical analysis to explain this behaviour. GPUs are carefully designed with on-chip resources primarily dedicated to achieving high arithmetic throughput rather than caching and flow control. Consequently, the mainstream view is that GPUs exhibit strong and weak EP. However, GPUs were experimentally found to violate strong EP. In this work, we experimentally study the weak EP of an Nvidia K40c GPU and an Nvidia P100 PCIe GPU using a specially designed matrix multiplication application. We show that both GPUs also breach weak EP, which presents an opportunity for bi-objective optimization of the application for dynamic energy and performance. Analyzing the Pareto fronts of dynamic energy and performance for a wide range of workloads, we find maximum dynamic energy savings of up to 18% at a performance degradation of 7% for the Nvidia K40c GPU, and up to 50% at a performance degradation of 11% for the Nvidia P100 PCIe GPU.
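
The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or code: the function names `pareto_front` and `tradeoffs`, the sample measurements in `configs`, and the choice to report savings relative to the fastest configuration are all assumptions made for illustration. Given hypothetical (execution time, dynamic energy) pairs for different configurations of one workload, it extracts the Pareto front and reports, for each remaining point, the performance degradation accepted and the dynamic energy saved.

```python
from typing import List, Tuple

# (execution time in seconds, dynamic energy in joules); both are minimized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing time."""
    front: List[Point] = []
    best_energy = float("inf")
    # After sorting by (time, energy), a point is on the front only if it
    # uses strictly less energy than every faster-or-equal point seen so far.
    for t, e in sorted(set(points)):
        if e < best_energy:
            front.append((t, e))
            best_energy = e
    return front


def tradeoffs(front: List[Point]) -> List[Tuple[float, float]]:
    """For each front point after the fastest one, return
    (performance degradation %, dynamic energy saving %),
    both measured against the fastest (performance-optimal) point."""
    t0, e0 = front[0]
    return [(100.0 * (t - t0) / t0, 100.0 * (e0 - e) / e0) for t, e in front[1:]]


# Hypothetical measurements, invented for this example only.
configs: List[Point] = [
    (10.0, 200.0),
    (10.5, 170.0),
    (11.0, 180.0),  # dominated by (10.5, 170.0)
    (12.0, 150.0),
    (10.0, 210.0),  # dominated by (10.0, 200.0)
]

front = pareto_front(configs)
for degradation, saving in tradeoffs(front):
    print(f"degradation {degradation:.1f}%  ->  energy saving {saving:.1f}%")
```

If a device were energy proportional in the sense that the fastest configuration were also the most energy-efficient one, the front in this sketch would collapse to a single point and `tradeoffs` would return an empty list; a front with more than one point is what leaves room for bi-objective optimization.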

Pages: 34-44

Page count: 11
