New tiling techniques to improve cache temporal locality

Cited by: 53
Authors
Song, YH [1]
Li, ZY [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Keywords
caches; loop transformations; optimizing compilers
DOI
10.1145/301631.301668
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Tiling is a well-known loop transformation to improve temporal locality of nested loops. Current compiler algorithms for tiling are limited to loops which are perfectly nested or can be transformed, in trivial ways, into a perfect nest. This paper presents a number of program transformations to enable tiling for a class of nontrivial imperfectly-nested loops such that cache locality is improved. We define a program model for such loops and develop compiler algorithms for their tiling. We propose to adopt odd-even variable duplication to break anti- and output dependences without unduly increasing the working-set size, and to adopt speculative execution to enable tiling of loops which may terminate prematurely due to, e.g., convergence tests in iterative algorithms. We have implemented these techniques in a research compiler, Panorama. Initial experiments with several benchmark programs are performed on SGI workstations based on MIPS R5K and R10K processors. Overall, the transformed programs run faster by 9% to 164%.
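For readers unfamiliar with the transformation named in the abstract, the following is a minimal sketch of classic tiling applied to a perfectly nested loop (matrix multiplication). It is not the paper's algorithm for imperfectly nested loops, and it does not show odd-even duplication or speculative execution. The function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are assumptions made for illustration only.

```python
def matmul_naive(a, b, n):
    """Untiled reference: the classic i-j-k perfect loop nest."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile=4):
    """Tiled version: the j and k loops are split into blocks of size `tile`.

    For a fixed (jj, kk) pair the inner loops touch only a tile-by-tile
    block of b, so that block can be reused across all values of i
    before it is evicted. `tile` is a hypothetical tuning parameter; a
    compiler would derive it from the cache size.
    """
    c = [[0] * n for _ in range(n)]
    for jj in range(0, n, tile):          # tile-controlling loop over j
        for kk in range(0, n, tile):      # tile-controlling loop over k
            for i in range(n):
                for j in range(jj, min(jj + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        c[i][j] += a[i][k] * b[k][j]
    return c
```

Both functions perform the same multiply-add operations, and each `c[i][j]` still accumulates its terms in increasing order of `k`. Only the traversal order across different elements changes, which is what makes the transformation legal for this loop nest.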
Pages: 215-228
Page count: 14
Related papers
50 in total
  • [21] Combining Software Cache Partitioning and Loop Tiling for Effective Shared Cache Management
    Kelefouras, Vasilios
    Keramidas, Georgios
    Voros, Nikolaos
    [J]. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2018, 17 (03)
  • [22] IMPROVING THE CACHE LOCALITY OF MEMORY ALLOCATION
    GRUNWALD, D
    ZORN, B
    HENDERSON, R
    [J]. SIGPLAN NOTICES, 1993, 28 (06): 177-186
  • [23] Cache Locality Optimization for Recursive Programs
    Lifflander, Jonathan
    Krishnamoorthy, Sriram
    [J]. ACM SIGPLAN NOTICES, 2017, 52 (06): 1-16
  • [24] Data Locality Exploitation in Cache Compression
    Zeng, Qi
    Jha, Rakesh
    Chen, Shigang
    Peir, Jih-Kwon
    [J]. 2018 IEEE 24TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2018), 2018: 347-354
  • [25] A locality aware cache diffusion system
    Casey, John
    Zhou, Wanlei
    [J]. JOURNAL OF SUPERCOMPUTING, 2010, 52 (01): 1-22
  • [26] Supporting cache locality optimization with a toolset
    Tao, Jie
    Karl, Wolfgang
    [J]. EURO-PAR 2006 PARALLEL PROCESSING, 2006, 4128: 25-34
  • [27] A locality aware cache diffusion system
    Casey, John
    Zhou, Wanlei
    [J]. The Journal of Supercomputing, 2010, 52: 1-22
  • [28] Static locality analysis for Cache management
    Sanchez, FJ
    Gonzalez, A
    Valero, M
    [J]. 1997 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, 1997: 261-271
  • [29] Cache resident data locality analysis
    Samdani, QG
    Thornton, MA
    [J]. 8TH INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS, PROCEEDINGS, 2000: 539-546
  • [30] Reuse-driven tiling for improving data locality
    Xue, JL
    Huang, CH
    [J]. INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 1998, 26 (06): 671-696