MULTI-GPU DGEMM AND HIGH PERFORMANCE LINPACK ON HIGHLY ENERGY-EFFICIENT CLUSTERS

Cited by: 4
Authors
Rohr, David [1]
Bach, Matthias [1]
Kretz, Matthias [1]
Lindenstruth, Volker [1]
Affiliations
[1] Goethe Univ Frankfurt, Frankfurt Inst Adv Studies, D-60438 Frankfurt, Germany
Keywords
DGEMM; double-precision general matrix multiply; GPGPU; Green IT; heterogeneous (hybrid) systems; High Performance Linpack; HPL; multi-GPU; system architecture
DOI
10.1109/MM.2011.66
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High Performance Linpack can maximize requirements throughout a computer system. An efficient multi-GPU double-precision general matrix multiply (DGEMM), together with adjustments to the HPL, is required to utilize a heterogeneous computer to its full extent. The authors present the resulting energy-efficiency measurements and suggest a cluster design that can utilize multiple GPUs. © 2011 IEEE.
Pages: 18-26
Page count: 9
Related papers
50 in total
  • [21] Multi-user predictive rendering on remote multi-GPU clusters
    Randrianandrasana, J.
    Chanonier, A.
    Deleau, H.
    Muller, T.
    Porral, P.
    Krajecki, M.
    Lucas, L.
    2018 IEEE FOURTH VR INTERNATIONAL WORKSHOP ON COLLABORATIVE VIRTUAL ENVIRONMENTS (3DCVE), 2018,
  • [22] Algorithmic skeletons for multi-core, multi-GPU systems and clusters
    Ernsting, Steffen
    Kuchen, Herbert
    International Journal of High Performance Computing and Networking, 2012, 7 (02) : 129 - 138
  • [23] High Performance Single and Multi-GPU Acceleration for Diffuse Optical Tomography
    Saikia, Manob Jyoti
    Kanhirodan, Rajan
    2014 INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I), 2014, : 1320 - 1323
  • [24] Distributed Join Algorithms on Multi-GPU Clusters with GPUDirect RDMA
    Guo, Chengxin
    Chen, Hong
    Zhang, Feng
    Li, Cuiping
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [25] Efficient Solving of Scan Primitive on Multi-GPU Systems
    Dieguez, Adrian P.
    Amor, Margarita
    Doallo, Ramon
    Nukada, Akira
    Matsuoka, Satoshi
    2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018, : 794 - 803
  • [26] An Efficient Implementation of GPU Virtualization in High Performance Clusters
    Duato, Jose
    Igual, Francisco D.
    Mayo, Rafael
    Pena, Antonio J.
    Quintana-Orti, Enrique S.
    Silla, Federico
    EURO-PAR 2009 PARALLEL PROCESSING WORKSHOPS, 2010, 6043 : 385 - +
  • [27] Efficient breadth first search on multi-GPU systems
    Mastrostefano, Enrico
    Bernaschi, Massimo
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (09) : 1292 - 1305
  • [28] Gossip: Efficient Communication Primitives for Multi-GPU Systems
    Kobus, Robin
    Juenger, Daniel
    Hundt, Christian
    Schmidt, Bertil
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [29] A Multi-GPU PCISPH Implementation with Efficient Memory Transfers
    Verma, Kevin
    Peng, Chong
    Szewc, Kamil
    Wille, Robert
    2018 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2018,
  • [30] G-DNA - a highly efficient multi-GPU/MPI tool for aligning nucleotide reads
    Frohmberg, W.
    Kierzynka, M.
    Blazewicz, J.
    Gawron, P.
    Wojciechowski, P.
    BULLETIN OF THE POLISH ACADEMY OF SCIENCES-TECHNICAL SCIENCES, 2013, 61 (04) : 989 - 992