Exploring the performance of massively multithreaded architectures

Cited by: 5
Authors
Bokhari, Shahid [1]
Saltz, Joel [2]
Affiliations
[1] Ohio State Univ, Dept Biomed Informat, Columbus, OH 43210 USA
[2] Emory Univ, Ctr Comprehens Informat, Atlanta, GA 30322 USA
Source
Funding
National Science Foundation (USA);
Keywords
Cray MTA; Cray XMT; IBM x3755; Itanium; multicore; multithreading; Opteron; parallel computing; parallel algorithms; SGI Altix; shared memory;
DOI
10.1002/cpe.1484
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
We present a new scheme for evaluating the performance of multithreaded computers and demonstrate its application to the Cray MTA-2 and XMT supercomputers. Our scheme is based on the concept of clock cycles per element, C, plotted against both problem size and the number of processors. This scheme clearly shows whether an implementation has achieved its asymptotic efficiency, and it is more general than (but includes) the commonly used speedup metric. It permits the discovery of imperfections in both the software and the hardware, and is expected to permit a unified comparison of many different parallel architectures. Measurements on a number of well-known parallel algorithms, ranging from matrix multiply to quicksort, are presented for the MTA-2 and XMT and highlight some interesting differences between these machines. The performance of sequence alignment using dynamic programming is evaluated on the MTA-2, XMT, IBM x3755 and SGI Altix 350, providing a useful comparison of the capabilities of the Cray machines with more conventional shared memory architectures. Copyright (c) 2009 John Wiley & Sons, Ltd.
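The abstract does not give the exact formula for C, so the following is only a minimal sketch under a stated assumption: that C is the elapsed time converted to clock cycles and divided by the number of elements processed. The function names, the processor-weighted variant, and all timing values are hypothetical and do not come from the paper. The sketch also illustrates how a speedup figure could be recovered from two C values measured at the same problem size.

```python
def cycles_per_element(elapsed_s, clock_hz, n_elements, n_procs=1):
    """Return (wall_clock_C, processor_weighted_C).

    ASSUMPTION: C = elapsed cycles / elements. The paper's precise
    definition is not stated in the abstract; this is illustrative only.
    """
    if n_elements <= 0 or n_procs <= 0:
        raise ValueError("n_elements and n_procs must be positive")
    wall = elapsed_s * clock_hz / n_elements
    # Weighted variant charges every processor for the full elapsed time.
    return wall, wall * n_procs


def speedup_from_c(c_one_proc, c_p_procs):
    """Speedup at a fixed problem size, from wall-clock C values."""
    return c_one_proc / c_p_procs


# Hypothetical measurements (placeholders, not data from the paper).
CLOCK_HZ = 200e6          # illustrative clock rate
N = 1_000_000             # elements processed

c1_wall, c1_weighted = cycles_per_element(2.0, CLOCK_HZ, N, n_procs=1)
c4_wall, c4_weighted = cycles_per_element(0.5, CLOCK_HZ, N, n_procs=4)

print(c1_wall, c4_wall, speedup_from_c(c1_wall, c4_wall))
```

Under this assumed definition, a processor-weighted C that stays flat as processors are added corresponds to linear speedup, while a rising value signals lost efficiency.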
Pages: 588-616
Page count: 29