MULTI-GPU DGEMM AND HIGH PERFORMANCE LINPACK ON HIGHLY ENERGY-EFFICIENT CLUSTERS

Cited by: 4

Authors:

Rohr, David [1]

Bach, Matthias [1]

Kretz, Matthias [1]

Lindenstruth, Volker [1]

Affiliations:

[1] Goethe Univ Frankfurt, Frankfurt Inst Adv Studies, D-60438 Frankfurt, Germany

Source:

IEEE MICRO | 2011, Vol. 31, No. 5

Keywords:

DGEMM; double-precision general matrix multiply; GPGPU; Green IT; heterogeneous (hybrid) systems; High Performance Linpack; HPL; multi-GPU; system architecture

DOI:

10.1109/MM.2011.66

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

High Performance Linpack can maximize requirements throughout a computer system. An efficient multi-GPU double-precision general matrix multiply (DGEMM), together with adjustments to the HPL, is required to utilize a heterogeneous computer to its full extent. The authors present the resulting energy-efficiency measurements and suggest a cluster design that can utilize multiple GPUs. © 2011 IEEE.

Pages: 18-26

Page count: 9
