On Energy Nonproportionality of CPUs and GPUs

Citations: 1
Authors
Manumachu, Ravi Reddy [1]
Lastovetsky, Alexey [1]
Affiliations
[1] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
Funding
Science Foundation Ireland
Keywords
Energy Proportionality; Multicore CPU; GPU; Bi-objective Optimization; Energy; Performance; 2D FFT; Matrix Multiplication; Data-Parallel Applications; Model
DOI
10.1109/IPDPSW55747.2022.00015
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Energy proportionality (EP) means designing a system that consumes energy proportional to the amount of work it performs. For an EP system, optimizing an application for performance also optimizes the application for total energy. Energy-proportional multicore CPUs and graphics processing units (GPUs) are fundamental to addressing the grand technological challenge of energy efficiency in Information and Communications Technology. In this work, we formally propose strong and weak notions of EP for modern microprocessors. Multicore CPUs were experimentally found to violate both strong and weak EP. This work presents the first attempt at a theoretical analysis to explain this behaviour. GPUs are carefully designed with on-chip resources primarily dedicated to achieving high arithmetic throughput rather than caching and flow control. Consequently, the mainstream view is that GPUs exhibit strong and weak EP. However, GPUs were experimentally found to violate strong EP. In this work, we experimentally study the weak EP of an Nvidia K40c GPU and an Nvidia P100 PCIe GPU using a specially designed matrix multiplication application. We show that both GPUs also breach weak EP, which presents an opportunity for bi-objective optimization of the application for dynamic energy and performance. Analyzing the Pareto fronts of dynamic energy and performance for a wide range of workloads, we find maximum dynamic energy savings of up to 18% while tolerating a performance degradation of 7% for the Nvidia K40c GPU, and of up to 50% while tolerating a performance degradation of 11% for the Nvidia P100 PCIe GPU.
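The Pareto-front analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pareto_front` and `tradeoffs` are hypothetical, and the sample measurements are made-up numbers, not results from the paper. Each point is assumed to be one application configuration measured as (execution time, dynamic energy).

```python
from typing import List, Tuple

# One measured configuration: (execution time in s, dynamic energy in J).
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the configurations not dominated in both time and energy,
    sorted from fastest to slowest."""
    front: List[Point] = []
    best_energy = float("inf")
    # After sorting by time, a point is Pareto-optimal only if it uses
    # strictly less energy than every faster (or equally fast) point.
    for t, e in sorted(set(points)):
        if e < best_energy:
            front.append((t, e))
            best_energy = e
    return front


def tradeoffs(front: List[Point]) -> List[Tuple[float, float]]:
    """For each front point, return (performance degradation %, dynamic
    energy saving %) relative to the fastest configuration."""
    t0, e0 = front[0]
    return [((t - t0) / t0 * 100.0, (e0 - e) / e0 * 100.0) for t, e in front]


if __name__ == "__main__":
    # Hypothetical measurements for illustration only (not from the paper).
    measured = [(2.0, 50.0), (2.1, 45.0), (2.2, 40.0), (2.3, 47.0), (2.5, 42.0)]
    front = pareto_front(measured)
    for (t, e), (slowdown, saving) in zip(front, tradeoffs(front)):
        print(f"time={t:.2f}s energy={e:.1f}J "
              f"slowdown={slowdown:.1f}% saving={saving:.1f}%")
```

If the fastest configuration also used the least dynamic energy, the front would collapse to a single point and there would be no trade-off to exploit. A front with several points is what makes bi-objective optimization for dynamic energy and performance meaningful.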
Pages: 34-44
Number of pages: 11