Latency-aware DVFS for efficient power state transitions on many-core architectures

Citations: 0
Authors
Zhiquan Lai
King Tin Lam
Cho-Li Wang
Jinshu Su
Affiliations
[1] National University of Defense Technology, National Key Laboratory of Parallel and Distributed Processing, College of Computer
[2] The University of Hong Kong, Department of Computer Science
Source
Keywords
Power management; Dynamic voltage and frequency scaling; Profiling; Shared virtual memory; Many-core processors; The single-chip cloud computer
DOI
Not available
CLC number
Subject classification code
Abstract
Energy efficiency is quickly becoming a first-class design constraint in high-performance computing (HPC). More efficient power management solutions are needed to reduce the energy costs and carbon footprint of HPC systems. Dynamic voltage and frequency scaling (DVFS) is a commonly used power management technique that trades off power consumption against system performance according to time-varying program behavior. However, prior work on DVFS seldom takes into account the voltage and frequency scaling latencies, which we found to be a crucial factor in determining the efficiency of a power management scheme. Frequent power state transitions made without latency awareness can noticeably degrade the execution performance of applications. The multiple voltage domains found in some many-core architectures make the effect of DVFS latencies even more significant. These concerns lead us to propose a new latency-aware DVFS scheme that selects the optimal power state more accurately. Our main idea is to analyze the latency characteristics in depth and to design a novel profile-guided DVFS solution that exploits the varying execution patterns of a parallel program to avoid excessive power state transitions. We implement the solution as a power management library for use by shared-memory parallel applications. Experimental evaluation on the Intel SCC many-core platform shows a significant improvement in power efficiency with our scheme. Compared with a latency-unaware approach, we achieve 24.0% extra energy saving, 31.3% more reduction in the energy–delay product and 15.2% less overhead in execution time in the average case across various benchmarks. Our algorithm is also shown to outperform a prior DVFS approach that attempted to mitigate the latency effects.
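The abstract's central point, that a power state transition only pays off when the upcoming phase is long enough to absorb the scaling latency, can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `PowerState`, `worth_switching` and `energy_delay_product`, the power figures, and the simple cost model (a frequency-insensitive wait phase, with the core still drawing the higher power while each transition completes) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    """Hypothetical power state; power_w is an assumed constant average draw."""
    name: str
    power_w: float


def worth_switching(current: PowerState, target: PowerState,
                    phase_s: float, latency_s: float) -> bool:
    """Decide whether dropping to `target` for one wait phase saves energy.

    Assumed model: the phase length does not depend on frequency (e.g. the
    core is waiting at a barrier). Scaling down overlaps the start of the
    phase; scaling back up adds `latency_s` of extra time at `current` power.
    """
    if phase_s <= latency_s:
        # The phase ends before the lower state is even reached.
        return False
    saved_j = (current.power_w - target.power_w) * (phase_s - latency_s)
    cost_j = current.power_w * latency_s  # extra energy spent scaling back up
    return saved_j > cost_j


def energy_delay_product(energy_j: float, time_s: float) -> float:
    """Energy-delay product: lower is better."""
    return energy_j * time_s


# Illustrative numbers only, not measurements from the paper.
HIGH = PowerState("high", power_w=2.0)
LOW = PowerState("low", power_w=0.5)
LATENCY_S = 0.005

long_phase = worth_switching(HIGH, LOW, phase_s=0.1, latency_s=LATENCY_S)
short_phase = worth_switching(HIGH, LOW, phase_s=0.008, latency_s=LATENCY_S)
```

Under these assumed numbers the 100 ms phase justifies a transition while the 8 ms phase does not, which is the kind of excessive transition a latency-unaware policy would still perform.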
Pages: 2720 / 2747
Page count: 27
Related papers
50 items in total
  • [21] THERMAL-AWARE POWER MIGRATION IN MANY-CORE PROCESSORS
    Raghu, Avinash
    Karajgikar, Saket
    Agonafer, Dereje
    Sammakia, Bahgat
    [J]. PROCEEDINGS OF THE ASME INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION 2010, VOL 4, 2012, : 397 - 404
  • [22] PoweRock: Power Modeling and Flexible Dynamic Power Management for Many-Core Architectures
    Lai, Zhiquan
    Lam, King Tin
    Wang, Cho-Li
    Su, Jinshu
    [J]. IEEE SYSTEMS JOURNAL, 2017, 11 (02): : 600 - 612
  • [23] A Novel Compute-Efficient Tridiagonal Solver for Many-Core Architectures
    Liu, Kan
    Xue, Wei
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (01) : 195 - 206
  • [24] Energy Efficient Power Distribution on Many-Core SoC
    Shihab, Mustafa M.
    Agrawal, Vishwani D.
    [J]. 2019 32ND INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2019 18TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID), 2019, : 488 - 493
  • [25] Techniques for Enabling Highly Efficient Message Passing on Many-Core Architectures
    Si, Min
    Balaji, Pavan
    Ishikawa, Yutaka
    [J]. 2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, : 697 - 700
  • [26] MANY-TASK COMPUTING ON MANY-CORE ARCHITECTURES
    Valero-Lara, Pedro
    Nookala, Poornima
    Pelayo, Fernando L.
    Jansson, Johan
    Dimitropoulos, Serapheim
    Raicu, Ioan
    [J]. SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2016, 17 (01): : 33 - 46
  • [27] An efficient implementation of kernel density estimation for multi-core and many-core architectures
    Lopez-Novoa, Unai
    Saenz, Jon
    Mendiburu, Alexander
    Miguel-Alonso, Jose
    [J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2015, 29 (03): : 331 - 347
  • [28] Performance Evaluation of OpenFOAM on Many-Core Architectures
    Brzobohaty, Tomas
    Riha, Lubomir
    Karasek, Tomas
    Kozubek, Tomas
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE OF NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2014 (ICNAAM-2014), 2015, 1648
  • [29] Graph Reachability on Parallel Many-Core Architectures
    Quer, Stefano
    Calabrese, Andrea
    [J]. COMPUTATION, 2020, 8 (04) : 1 - 26
  • [30] A Compressive Sensing Algorithm for Many-Core Architectures
    Borghi, A.
    Darbon, J.
    Peyronnet, S.
    Chan, T. F.
    Osher, S.
    [J]. ADVANCES IN VISUAL COMPUTING, PT II, 2010, 6454 : 678 - 686