Scalability of parallel spatial direct numerical simulations on Intel hypercube and IBM SP1 and SP2

Cited by: 1

Authors:

Joslin, Ronald D. ^{[1]}

Hanebutte, Ulf R. ^{[1]}

Zubair, Mohammad ^{[1]}

Affiliations:

[1] NASA Langley Research Cent, Hampton, United States

Source:

Journal of Scientific Computing | 1995 / Vol. 10 / No. 2

Keywords:

Boundary layer flow - Calculations - Codes (symbols) - Computation theory - Computer architecture - Data structures - Estimation - Fast Fourier transforms - Laminar flow - Optimization - Parallel processing systems - Turbulent flow

DOI:

Not available

Abstract:

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and the IBM SP1 and SP2 parallel computers are documented. Spatially evolving disturbances associated with laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS code to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can be parallelized effectively on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs and to match them against the actual costs as the number of grid points changes. As the number of processors increases, slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. The speedup is slower than linear because the computational cost is dominated by the FFT routine, which yields less-than-ideal speedups. With appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application.
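The slower-than-linear speedup described in the abstract can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from the paper; it uses only the figures quoted in the abstract and assumes that "the same time" for eight SP1 nodes means a ratio of exactly 1.0 relative to the Cray Y/MP. The function name `scaling_efficiency` and the variable names are illustrative.

```python
def scaling_efficiency(nodes_base, nodes_scaled, speedup):
    """Return the fraction of ideal (linear) speedup that was achieved."""
    ideal_speedup = nodes_scaled / nodes_base
    return speedup / ideal_speedup


# Performance relative to one Cray Y/MP time step, as quoted in the abstract.
SP1_8_NODES_VS_YMP = 1.0   # assumption: "same time" taken as exactly 1.0
SP1_32_NODES_VS_YMP = 2.9

# Relative speedup when going from 8 to 32 SP1 nodes (4x the nodes).
speedup_8_to_32 = SP1_32_NODES_VS_YMP / SP1_8_NODES_VS_YMP
efficiency = scaling_efficiency(8, 32, speedup_8_to_32)

# Peak rate implied by "52-56 Mflops ... 45 percent of theoretical peak".
# The 45 percent figure is rounded, so this is only a rough range.
implied_peak_low = 52 / 0.45
implied_peak_high = 56 / 0.45

print(f"speedup 8 -> 32 nodes: {speedup_8_to_32:.2f}x of an ideal 4.00x")
print(f"scaling efficiency: {efficiency:.3f}")
print(f"implied single-node peak: {implied_peak_low:.0f}-{implied_peak_high:.0f} Mflops")
```

Under these assumptions, quadrupling the node count yields well under a fourfold gain, which is consistent with the abstract's statement that the FFT-dominated cost gives less-than-ideal speedups.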

Pages: 233-269
