Exploring the performance of massively multithreaded architectures

Cited by: 5

Authors:

Bokhari, Shahid [1]

Saltz, Joel [2]

Affiliations:

[1] Ohio State Univ, Dept Biomed Informat, Columbus, OH 43210 USA

[2] Emory Univ, Ctr Comprehens Informat, Atlanta, GA 30322 USA

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2010 / Vol. 22 / Issue 05

Funding:

National Science Foundation (USA);

Keywords:

Cray MTA; Cray XMT; IBM x3755; Itanium; multicore; multithreading; Opteron; parallel computing; parallel algorithms; SGI Altix; shared memory;

DOI:

10.1002/cpe.1484

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

We present a new scheme for evaluating the performance of multithreaded computers and demonstrate its application to the Cray MTA-2 and XMT supercomputers. Our scheme is based on the concept of clock cycles per element, C, plotted against both problem size and the number of processors. This scheme clearly shows whether an implementation has achieved its asymptotic efficiency and is more general than (but includes) the commonly used speedup metric. It permits the discovery of any imperfections in both the software and the hardware, and is expected to permit a unified comparison of many different parallel architectures. Measurements on a number of well-known parallel algorithms, ranging from matrix multiply to quicksort, are presented for the MTA-2 and XMT and highlight some interesting differences between these machines. The performance of sequence alignment using dynamic programming is evaluated on the MTA-2, XMT, IBM x3755 and SGI Altix 350 and provides a useful comparison of the capabilities of the Cray machines with more conventional shared memory architectures. Copyright (c) 2009 John Wiley & Sons, Ltd.
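The abstract names the metric but does not give its formula. As a rough illustration only, the sketch below assumes C is the total number of processor clock cycles spent per input element, i.e. run time × clock rate × number of processors ÷ problem size; this normalization, the function names, and all numbers are assumptions for illustration and are not taken from the paper. Under that assumption a flat C curve across processor counts corresponds to perfect scaling, and the usual speedup can be recovered from two C values measured at the same problem size.

```python
def cycles_per_element(seconds, clock_hz, processors, elements):
    """Total processor clock cycles spent per input element.

    Assumed definition (not confirmed by the abstract):
    C = run time * clock rate * processors / problem size.
    """
    if elements <= 0 or processors <= 0:
        raise ValueError("elements and processors must be positive")
    return seconds * clock_hz * processors / elements


def speedup_from_c(c_one, c_p, processors):
    """Recover speedup T(1)/T(P) from C measured on 1 and on P processors.

    Valid only when both values use the same problem size and clock rate.
    """
    return processors * c_one / c_p


# Made-up timings for illustration only; not measurements from the paper.
clock_hz = 200e6   # hypothetical clock rate in Hz
n = 1_000_000      # hypothetical problem size (elements)
c1 = cycles_per_element(0.10, clock_hz, 1, n)   # 1 processor, 0.10 s
c4 = cycles_per_element(0.03, clock_hz, 4, n)   # 4 processors, 0.03 s
print(c1, c4, speedup_from_c(c1, c4, 4))
```

In this made-up case C rises when going from one to four processors, which under the assumed definition signals less-than-ideal scaling even though the run time dropped.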


Pages: 588-616

Number of pages: 29
