Architecture scalability of parallel vector computers with a shared memory

Cited by: 2

Authors:

Dekker, E [1]

Affiliations:

[1] Delft Univ Technol, Fac Informat Technol & Syst, NL-2628 CD Delft, Netherlands

Source:

IEEE TRANSACTIONS ON COMPUTERS | 1998 / Vol. 47 / No. 05

Keywords:

architecture scalability; parallel vector computers; shared memory; sustainable peak performance; theoretical peak performance;

DOI:

10.1109/12.677257

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Based on a model of a parallel vector computer with a shared memory, its scalability properties are derived. The processor-memory interconnection network is assumed to be composed of crossbar switches of size b × b. This paper analyzes sustainable peak performance under optimal conditions, i.e., no memory bank conflicts, sufficient processor-memory bank pathways, and no interconnection network conflicts. It will be shown that, with fully vectorizable algorithms and no communication overhead, the sustainable peak performance does not scale up linearly with the number of processors p. If the interconnection network is unbuffered, the number of memory banks must increase at least with O(p log_b p) to sustain peak performance. If the network is buffered, this bottleneck can be alleviated; however, the half performance vector length still increases with O(log_b p). The paper confirms the validity of the model by examining the performance behavior of the LINPACK benchmark.
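
The growth rates quoted in the abstract can be illustrated with a short sketch. It is not taken from the paper: it assumes a multistage network in which p ports are connected through ceil(log_b p) stages of b × b crossbar switches, and it sets all constant factors to 1, so only the relative growth is meaningful. The function names `stages` and `relative_bank_growth` are hypothetical.

```python
def stages(p: int, b: int) -> int:
    """Smallest s with b**s >= p, i.e. ceil(log_b p), computed with integers.

    Assumption (not from the paper): this is read as the number of
    b x b crossbar stages needed to connect p ports.
    """
    if p < 1 or b < 2:
        raise ValueError("need p >= 1 and b >= 2")
    s, reach = 0, 1
    while reach < p:
        reach *= b
        s += 1
    return s


def relative_bank_growth(p: int, b: int) -> int:
    """p * ceil(log_b p): the O(p log_b p) term with the constant set to 1."""
    return p * stages(p, b)


if __name__ == "__main__":
    b = 4  # illustrative switch size, chosen arbitrarily
    for p in (4, 16, 64, 256):
        print(p, stages(p, b), relative_bank_growth(p, b))
```

Under these assumptions, quadrupling p with b = 4 adds one stage, so the bank-count term grows slightly faster than p itself, which is the sense in which the abstract says performance does not scale linearly.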


Pages: 614-624

Page count: 11
