ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs

Cited by: 25
Authors
Liu, Xu [1]
Wu, Bo [2]
Affiliations
[1] Coll William & Mary, Dept Comp Sci, Williamsburg, VA 23185 USA
[2] Colorado Sch Mines, Dept EECS, Golden, CO 80401 USA
Keywords
Memory bottlenecks; scalability; parallel profiler
DOI
10.1145/2807591.2807648
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
It is difficult to scale parallel programs in a system that employs a large number of cores. To identify scalability bottlenecks, existing tools principally pinpoint poor thread synchronization strategies or unnecessary data communication. The memory subsystem is one of the key contributors to poor parallel scaling in multicore machines. State-of-the-art tools, however, either lack sophisticated capabilities for pinpointing scalability bottlenecks that arise from the memory subsystem or ignore them entirely. To address this issue, we develop ScaAnalyzer, a tool that pinpoints scaling losses due to poor memory access behaviors of parallel programs. ScaAnalyzer collects, attributes, and analyzes memory-related metrics during program execution while incurring very low overhead. ScaAnalyzer provides high-level, detailed guidance to programmers for scalability optimization. We demonstrate the utility of ScaAnalyzer with case studies of three parallel programs. For each benchmark, ScaAnalyzer identifies scalability bottlenecks caused by poor memory access behaviors and provides optimization guidance that yields significant improvement in scalability.
Pages: 12