A memory-layout oriented run-time technique for locality optimization on SMPs

Cited by: 3
Authors
Yan, Y [1]
Zhang, XD [1]
Zhang, Z [1]
Affiliation
[1] HAL Comp Syst Inc, Campbell, CA 95008 USA
Keywords
DOI
10.1109/ICPP.1998.708484
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Exploiting locality at run-time complements compiler-based approaches for applications with dynamic memory access patterns. This paper proposes a memory-layout oriented approach to exploit cache locality for parallel loops at run-time on Symmetric Multi-Processor (SMP) systems. Guided by application-dependent hints and the targeted cache architecture, it reorganizes and partitions a parallel loop by shrinking and partitioning the memory-access space of the loop at run-time. In the generated task partitions, data sharing among partitions is minimized and data reuse within a partition is maximized. The execution of tasks in partitions is scheduled in an adaptive and locality-preserving way to achieve balanced execution, minimizing the execution time of applications by trading off load balance and locality. Based on simulation and measurement, we show that our run-time approach can achieve performance comparable to compiler optimizations for two applications whose load balance and cache locality can be well optimized by tiling and other program transformations. However, our experimental results also show that our approach is able to significantly improve the memory performance of applications with dynamic memory access patterns. Programs of this type are usually hard for compilers to optimize.
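The partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes each loop iteration touches a single array index supplied as a hint, and the names `partition_iterations`, `accesses`, `block_size`, and `num_procs` are hypothetical. Iterations that touch the same cache-sized block of memory are grouped together (data reuse within a partition), and the groups are handed out greedily to the least-loaded processor (load balance). The adaptive run-time scheduling mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def partition_iterations(accesses, block_size, num_procs):
    """Group loop iterations by the memory block they touch, then
    spread the groups over processors.

    accesses[i] is the array index touched by iteration i
    (a hypothetical application-dependent hint).
    block_size is the number of array elements assumed to fit in
    one cache-sized block (an assumption, not a value from the paper).
    """
    # Step 1: iterations touching the same block go into the same group,
    # so no block is shared between two groups.
    blocks = defaultdict(list)
    for iteration, index in enumerate(accesses):
        blocks[index // block_size].append(iteration)

    # Step 2: assign groups, largest first, to the currently
    # least-loaded processor (simple greedy load balancing).
    partitions = [[] for _ in range(num_procs)]
    ordered = sorted(blocks.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _, iterations in ordered:
        target = min(range(num_procs), key=lambda p: len(partitions[p]))
        partitions[target].extend(iterations)
    return partitions


if __name__ == "__main__":
    # Eight iterations alternating between two distant memory regions.
    hints = [0, 9, 1, 8, 2, 10, 3, 11]
    print(partition_iterations(hints, block_size=4, num_procs=2))
```

With the sample hints, the iterations that touch indices 0 to 3 end up on one processor and those that touch indices 8 to 11 on the other, so neither processor shares a block with the other.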
Pages: 189-196
Page count: 8