Optimizing Irregular Shared-Memory Applications for Clusters

Authors
Min, Seung-Jai [1]
Eigenmann, Rudolf [1]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
Keywords
Compiler Analysis; Runtime Techniques; OpenMP; MPI; Irregular Data Accesses; Performance;
DOI
Not available
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Irregular applications pose challenges in optimizing communication because irregular data accesses are difficult to analyze accurately and efficiently. The challenge is especially acute when translating irregular shared-memory applications to message-passing form for clusters. Without effective irregular data analysis, the translation system generates unnecessary or redundant communication, which limits application scalability. In this paper, we present a Lean Distributed Shared Memory (LDSM) system, which features a fast and accurate irregular data access (IDA) analysis. The analysis uses a region-based diff method and a runtime library that is optimized for irregular applications. We describe three optimizations that improve the performance of the LDSM system. A parallel array reduction transformation reduces overheads in the analysis. A packed communication optimization and a differential communication optimization effectively eliminate unnecessary and redundant messages. We evaluate the performance of the optimized LDSM system on a set of representative irregular benchmarks. The optimized LDSM executes irregular applications on average 45% faster than the hand-tuned MPI applications.
Pages: 256-265
Page count: 10