A detailed performance analysis of the interpolation supplemented lattice Boltzmann method on the Cray T3E and Cray X1

Cited by: 7

Authors:

Sunder, C. Shyam [1]

Baskar, G.

Babu, V.

Strenski, David

Affiliations:

[1] Indian Inst Technol, Dept Mech Engn, TDCE, Madras 600036, Tamil Nadu, India

[2] Cornell Univ, Sibley Sch Mech & Aerosp Engn, Mat Proc Design & Control Lab, Ithaca, NY 14853 USA

[3] ETH, Inst Energietech, CH-8092 Zurich, Switzerland

[4] Cray Inc, Seattle, WA 98104 USA

Source:

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS | 2006 / Vol. 20 / Issue 04

Keywords:

shared memory; multiprocessors; parallel computing; SHMEM; MPI

DOI:

10.1177/1094342006064572

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

A detailed study of the parallel performance of the interpolation supplemented lattice Boltzmann (ISLB) method using SHMEM and MPI on the Cray T3E-900 and Cray X1 architectures is presented. The noteworthy feature of the present implementation of the ISLB method is that it is able to achieve a sustained speed of 4.2 Tflop/s while using 504 processors on a Cray X1. The code is shown to achieve super-linear speedups on the Cray T3E-900. It is shown through detailed profiling that the computation and the communication scale well on the Cray X1, although the overall speedup is adversely affected by the cost of barrier synchronization.

Pages: 557-570

Page count: 14
