PERFORMANCE BENEFITS AND LIMITATIONS OF LARGE NUMA MULTIPROCESSORS

Cited by: 3
Authors
SEVCIK, KC [1]
ZHOU, SN [1]
Affiliation
[1] UNIV TORONTO, COMP SYST RES INST, TORONTO M5S 1A1, ONTARIO, CANADA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
10.1016/0166-5316(94)90013-2
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In scalable multiprocessor architectures, the times required for a processor to access various portions of memory are different. In this paper, we consider how this characteristic affects performance by comparing it to the ideal but unrealizable case in which the access times to all memory modules can be kept constant, even as the number of processors is increased. We examine several application kernels to investigate how well they would execute on various instances of NUMA systems with a hierarchical memory structure. The results of our analytic model show that access locality is much more important in NUMA architectures than it is in UMA architectures. The extent of the performance penalty of non-local memory accesses depends on the variability in access times to various parts of shared memory, as well as on the amount of congestion in the interconnection network that provides access to remote memory modules. In the applications we examined, we found that it is possible to partition and locate both the data and the computation in such a way that reasonable speedups can be achieved on NUMA systems.
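The abstract's point about access locality can be illustrated with a minimal sketch. This is not the paper's analytic model: it assumes a simple two-level memory (local and remote), and the function names, the `congestion` multiplier, and the example latencies are all hypothetical choices made for illustration.

```python
def mean_access_time(local_fraction, t_local, t_remote, congestion=1.0):
    """Average memory access time for a hypothetical two-level NUMA memory.

    local_fraction: share of accesses served by the local module (0..1).
    congestion: assumed multiplier (>= 1) on remote latency standing in
                for interconnection-network contention.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    if congestion < 1.0:
        raise ValueError("congestion must be at least 1")
    remote_cost = t_remote * congestion
    return local_fraction * t_local + (1.0 - local_fraction) * remote_cost


def slowdown_vs_uma(local_fraction, t_local, t_remote, congestion=1.0):
    """Ratio of the NUMA mean access time to an idealized UMA machine
    in which every access costs t_local."""
    return mean_access_time(local_fraction, t_local, t_remote, congestion) / t_local


if __name__ == "__main__":
    # Illustrative latencies only: remote access assumed 10x slower than local.
    for frac in (1.0, 0.9, 0.5):
        print(frac, slowdown_vs_uma(frac, t_local=1.0, t_remote=10.0))
```

Under these assumptions, the slowdown grows with both the gap between remote and local latency and the congestion multiplier, which matches the qualitative dependence the abstract describes.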
Pages: 185 - 205
Number of pages: 21
Related papers
50 in total
  • [1] Clustered affinity scheduling on large-scale NUMA multiprocessors
    Wang, YM
    Wang, HH
    Chang, RC
    JOURNAL OF SYSTEMS AND SOFTWARE, 1997, 39 (01) : 61 - 70
  • [2] Performance evaluation of cache depot on CC-NUMA multiprocessors
    Hsiao, HC
    King, CT
    1998 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 1998, : 519 - 526
  • [3] Performance evaluation for CC-NUMA multiprocessors using an OLTP workload
    Chung, YW
    Kim, H
    Park, JW
    Lee, K
    MICROPROCESSORS AND MICROSYSTEMS, 2001, 25 (04) : 221 - 229
  • [4] Performance evaluation of two-level scheduling algorithms for NUMA multiprocessors
    Nara Inst of Science and Technology, Ikoma, Japan
    Syst Comput Jpn, 2 (36-46):
  • [5] Performance evaluation of home-cluster based scheduling for NUMA multiprocessors
    Koita, T
    Katayama, T
    Saisho, K
    Fukuda, A
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000, : 1939 - 1945
  • [6] PAGE PLACEMENT POLICIES FOR NUMA MULTIPROCESSORS
    LAROWE, RP
    ELLIS, CS
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1991, 11 (02) : 112 - 129
  • [7] On the design of a high-performance adaptive router for CC-NUMA multiprocessors
    Puente, V
    Gregorio, JA
    Beivide, R
    Izu, C
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2003, 14 (05) : 487 - 501
  • [8] Classifying and alleviating the communication overheads in matrix computations on large-scale NUMA multiprocessors
    Wang, YM
    Wang, HH
    Chang, RC
    JOURNAL OF SYSTEMS AND SOFTWARE, 1998, 44 (01) : 17 - 29
  • [9] Exploiting Network Locality for CC-NUMA Multiprocessors
    Hung-Chang Hsiao
    Chung-Ta King
    The Journal of Supercomputing, 2001, 18 : 63 - 87
  • [10] EXPERIMENTAL COMPARISON OF MEMORY MANAGEMENT POLICIES FOR NUMA MULTIPROCESSORS
    LAROWE, RP
    ELLIS, CS
    ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1991, 9 (04) : 319 - 363