Optimization power consumption model of reliability-aware GPU clusters

Authors
Haifeng Wang
Qingkui Chen
Affiliations
[1] School of Management, University of Shanghai for Science and Technology, Shanghai
[2] Information School, LinYi University
[3] School of Optical, University of Shanghai for Science and Technology
Source
Journal of Supercomputing, 2014, 67 (01)
Keywords
Power consumption optimization; Reliability; GPU clusters; Model prediction control
DOI
Not available
Abstract
Power control for reliability-aware GPU clusters with dynamically variable voltage and speed is investigated as a combinatorial optimization problem in two forms: minimizing task execution time under an energy consumption constraint, and minimizing energy consumption under a system reliability constraint. Both problems apply to general multiprocessor computing and real-time multiprocessing systems, where energy consumption and system reliability are both important. These problems, which emphasize the trade-off among performance, power, and reliability, have not been well studied before. In this research, a novel power control model is built based on Model Prediction Control theory. The Maximum Entropy Method is used to determine the partial ordering relation of the control variables and to assess the quality of solutions. The controller caps redundant energy consumption by dynamically switching the energy states of the nodes in the GPU cluster. We compare our controller with a control scheme that does not consider system reliability. The experimental results demonstrate that the proposed controller is more reliable and valuable.
Pages: 153 - 174
Number of pages: 21
Related papers
50 in total
  • [1] Optimization power consumption model of reliability-aware GPU clusters
    Wang, Haifeng
    Chen, Qingkui
    [J]. JOURNAL OF SUPERCOMPUTING, 2014, 67 (01): 153 - 174
  • [2] Power consumption optimization control model of GPU clusters
    Information School, LinYi University, Linyi, Shandong 276005, China
    [J]. Tien Tzu Hsueh Pao, 10 (1904-1910): 1904 - 1910
  • [3] Event driven power consumption optimization control model of GPU clusters
    Haifeng Wang
    Yunpeng Cao
    [J]. Cluster Computing, 2019, 22 : 965 - 979
  • [4] Event driven power consumption optimization control model of GPU clusters
    Wang, Haifeng
    Cao, Yunpeng
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (03): 965 - 979
  • [5] A Reliability-aware Environment for Design Exploration for GPU Devices
    Sierra, Robert Limas
    Guerrero-Balaguera, Juan-David
    Condia, Josie E. Rodriguez
    Reorda, Matteo Sonza
    [J]. 2023 26TH INTERNATIONAL SYMPOSIUM ON DESIGN AND DIAGNOSTICS OF ELECTRONIC CIRCUITS AND SYSTEMS, DDECS, 2023: 169 - 174
  • [6] Reliability-aware performance model for optimal GPU-enabled cluster environment
    Supada Laosooksathit
    Raja Nassar
    Chokchai Leangsuksun
    Mihaela Paun
    [J]. The Journal of Supercomputing, 2014, 68 : 1630 - 1651
  • [7] Reliability-aware performance model for optimal GPU-enabled cluster environment
    Laosooksathit, Supada
    Nassar, Raja
    Leangsuksun, Chokchai
    Paun, Mihaela
    [J]. JOURNAL OF SUPERCOMPUTING, 2014, 68 (03): 1630 - 1651
  • [8] Reliability-Aware Optimization of a Wideband Antenna
    Kouassi, Attibaud
    Nghia Nguyen-Trong
    Kaufmann, Thomas
    Lallechere, Sebastien
    Bonnet, Pierre
    Fumeaux, Christophe
    [J]. IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2016, 64 (02) : 450 - 460
  • [9] BRAVO: Balanced Reliability-Aware Voltage Optimization
    Swaminathan, Karthik
    Chandramoorthy, Nandhini
    Cher, Chen-Yong
    Bertran, Ramon
    Buyuktosunoglu, Alper
    Bose, Pradip
    [J]. 2017 23RD IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2017: 97 - 108
  • [10] Reliability-Aware Runahead
    Naithani, Ajeya
    Eeckhout, Lieven
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022), 2022: 786 - 799