Energy-efficient Stencil Computations on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication

Cited by: 1
Authors
Oden, Lena [1]
Klenk, Benjamin [2]
Froening, Holger [2]
Affiliations
[1] Fraunhofer Inst Ind Math, Competence Ctr High Performance Comp, Kaiserslautern, Germany
[2] Heidelberg Univ, Inst Comp Engn, Heidelberg, Germany
Keywords
GPUs; Energy Efficient; Dynamic Parallelism; Communication; Data Transfer; Infiniband
DOI
10.1109/E2SC.2014.14
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
GPUs are widely used in high performance computing, due to their high computational power and high performance per Watt. Still, one of the main bottlenecks of GPU-accelerated cluster computing is the data transfer between distributed GPUs. This not only affects performance, but also power consumption. The most common way to utilize a GPU cluster is a hybrid model, in which the GPU is used to accelerate the computation while the CPU is responsible for the communication. This approach always requires a dedicated CPU thread, which consumes additional CPU cycles and therefore increases the power consumption of the complete application. In recent work we have shown that the GPU is able to control the communication independently of the CPU. Still, there are several problems with GPU-controlled communication. The main problem is intra-GPU synchronization, since GPU blocks are non-preemptive. Therefore, the use of communication requests within a GPU can easily result in a deadlock. In this work we show how Dynamic Parallelism solves this problem. GPU-controlled communication in combination with Dynamic Parallelism allows keeping the control flow of multi-GPU applications on the GPU and bypassing the CPU completely. Although the performance of applications using GPU-controlled communication is still slightly worse than the performance of hybrid applications, we will show that performance per Watt increases by up to 10% while still using commodity hardware.
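The closing claim of the abstract (slightly lower performance, yet higher performance per Watt) can look contradictory at first. The short sketch below works through the arithmetic: performance per Watt is work divided by energy, so a run that takes a little longer can still come out ahead if its average power draw falls by more than its runtime grows. All numbers here (`WORK`, the runtimes, and the power draws) are hypothetical values chosen for illustration; they are not measurements from the paper.

```python
def perf_per_watt(work_units: float, runtime_s: float, avg_power_w: float) -> float:
    """Performance per Watt = (work / time) / power, i.e. work per Joule."""
    return (work_units / runtime_s) / avg_power_w


def relative_gain(baseline: float, candidate: float) -> float:
    """Fractional improvement of candidate over baseline (0.05 means +5%)."""
    return candidate / baseline - 1.0


# Hypothetical workload: number of stencil grid-point updates (assumed value).
WORK = 1.0e9

# Hypothetical hybrid run: a CPU thread drives communication (assumed values).
hybrid = perf_per_watt(WORK, runtime_s=100.0, avg_power_w=200.0)

# Hypothetical GPU-controlled run: 2% slower, but the CPU stays idle,
# so the average node power is assumed to be lower.
gpu_controlled = perf_per_watt(WORK, runtime_s=102.0, avg_power_w=180.0)

gain = relative_gain(hybrid, gpu_controlled)
print(f"runtime change: {102.0 / 100.0 - 1.0:+.1%}")
print(f"performance-per-Watt change: {gain:+.1%}")
```

With these assumed inputs the GPU-controlled variant is slower in wall-clock time but consumes less total energy, which is the kind of trade-off the abstract describes.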
Pages: 31-40
Number of pages: 10