A detailed performance analysis of the interpolation supplemented lattice Boltzmann method on the Cray T3E and Cray X1

Cited by: 7
Authors
Sunder, C. Shyam [1]
Baskar, G.
Babu, V.
Strenski, David
Affiliations
[1] Indian Inst Technol, Dept Mech Engn, TDCE, Madras 600036, Tamil Nadu, India
[2] Cornell Univ, Sibley Sch Mech & Aerosp Engn, Mat Proc Design & Control Lab, Ithaca, NY 14853 USA
[3] ETH, Inst Energietech, CH-8092 Zurich, Switzerland
[4] Cray Inc, Seattle, WA 98104 USA
Keywords
shared memory; multiprocessors; parallel computing; SHMEM; MPI
DOI
10.1177/1094342006064572
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A detailed study of the parallel performance of the interpolation supplemented lattice Boltzmann (ISLB) method using SHMEM and MPI on the Cray T3E-900 and Cray X1 architectures is presented. The noteworthy feature of the present implementation of the ISLB method is that it is able to achieve a sustained speed of 4.2 Tflop/s while using 504 processors on a Cray X1. The code is shown to achieve super-linear speedups on the Cray T3E-900. It is shown through detailed profiling that the computation and the communication scale well on the Cray X1, although the overall speedup is adversely affected by the cost of barrier synchronization.
Pages: 557-570
Page count: 14
Related papers
50 in total
  • [31] High performance computing on the Cray T3E and IBM SP2 systems with the parallel version of GAUSSIAN 94
    Gorb, L
    Yanov, I
    Leszczynski, J
    PARALLEL COMPUTING, 2000, 26 (7-8) : 1043 - 1060
  • [32] 3-D large-scale wave propagation modeling by spectral element method on Cray T3E multiprocessor
    Seriani, G
    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 1998, 164 (1-2) : 235 - 247
  • [33] Cluster computing vs. Cray T3E - A case study from numerical field theory
    Arnold, G
    Eicker, N
    Lippert, T
    Schilling, K
    NINTH EUROMICRO WORKSHOP ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 2001, : 475 - 479
  • [34] Computations of three-dimensional compressible Rayleigh-Taylor instability on SGI/Cray T3E
    Deane, A
    PARALLEL COMPUTATIONAL FLUID DYNAMICS: TOWARDS TERAFLOPS, OPTIMIZATION, AND NOVEL FORMULATIONS, 2000, : 189 - 198
  • [35] A scalable HPF implementation of a finite-volume computational electromagnetics application on a CRAY T3E parallel system
    Pan, Y
    Shang, JJS
    Guo, M
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2003, 15 (06): : 607 - 621
  • [36] A scalable molecular-dynamics algorithm suite for materials simulations: design-space diagram on 1024 Cray T3E processors
    Shimojo, F
    Campbell, TJ
    Kalia, RK
    Nakano, A
    Vashishta, P
    Ogata, S
    Tsuruta, K
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2000, 17 (03): : 279 - 291
  • [37] Simulation of point defect clustering in Cz-silicon wafers on the Cray T3E scalable parallel computer: Application to oxygen precipitation
    Karoui, FS
    Karoui, A
    Rozgonyi, GA
    2000 INTERNATIONAL CONFERENCE ON MODELING AND SIMULATION OF MICROSYSTEMS, TECHNICAL PROCEEDINGS, 2000, : 98 - 101
  • [38] Performance analysis of Cray T3D and connection machine CM-5: A comparison
    Marenzoni, P
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1995, 919 : 110 - 117
  • [39] Performance analysis and optimization of a parallel carbon molecular dynamic code on a Cray T3E
    Horoi, M
    Enbody, RJ
    1998 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING - PROCEEDINGS, 1998, : 62 - 69
  • [40] The performance and scalability of SHMEM and MPI-2 one-sided routines on a SGI Origin 2000 and a Cray T3E-600
    Luecke, GR
    Spanoyannis, S
    Kraeva, M
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2004, 16 (10): : 1037 - 1060