Optimization power consumption model of reliability-aware GPU clusters

Cited by: 0
Authors
Haifeng Wang
Qingkui Chen
Affiliations
[1] University of Shanghai for Science and Technology, Shanghai, School of Management
[2] LinYi University, Information School
[3] University of Shanghai for Science and Technology, School of Optical
Keywords
Power consumption optimization; Reliability; GPU clusters; Model prediction control
DOI
Not available
Abstract
Power control in reliability-aware GPU clusters with dynamically variable voltage and speed is investigated as a combinatorial optimization problem in two forms: minimizing task execution time under an energy consumption constraint, and minimizing energy consumption under a system reliability constraint. Both problems arise in general multiprocessor computing and in real-time multiprocessing systems, where energy consumption and system reliability are both important. These problems, which emphasize the trade-off among performance, power, and reliability, have not been well studied before. In this research, a novel power control model is built on Model Prediction Control theory. The Maximum Entropy Method is used to determine the partial ordering relation of the control variables and to assess the quality of solutions. The controller caps redundant energy consumption by dynamically switching the energy states of the nodes in the GPU cluster. We compare our controller with a control scheme that does not consider system reliability. The experimental results demonstrate that the proposed controller is more reliable and valuable.
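The two problems named in the abstract can be written schematically as follows. This is only a sketch for orientation: the abstract gives no notation, so every symbol here is an assumption. It takes $s = (s_1, \dots, s_n)$ as the voltage/speed state chosen for each of $n$ nodes from a finite set $S$, $T(s)$ as task execution time, $E(s)$ as energy consumption, $R(s)$ as system reliability, and $E_{\max}$ and $R_{\min}$ as the given energy budget and reliability threshold.

```latex
% Problem 1 (assumed notation): minimize execution time under an energy budget
\min_{s \in S^{n}} \; T(s)
\quad \text{subject to} \quad E(s) \le E_{\max}

% Problem 2 (assumed notation): minimize energy under a reliability threshold
\min_{s \in S^{n}} \; E(s)
\quad \text{subject to} \quad R(s) \ge R_{\min}
```

Because each $s_i$ is drawn from a finite set of states, both problems are combinatorial, which matches the abstract's framing; the paper's actual objective and constraint functions may differ from this generic form.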
Pages: 153-174
Number of pages: 21