Improving cache locality with blocked array layouts

Cited by: 1
Authors
Athanasaki, E [1]
Koziris, N [1]
Affiliations
[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Comp Syst Lab, GR-15773 Zografos, Greece
DOI
10.1109/EMPDP.2004.1271460
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Minimizing cache misses is one of the most important factors in reducing the average latency of memory accesses. Tiled codes modify the instruction stream to exploit cache locality for array accesses. In this paper we further reduce cache misses by restructuring the memory layout of multidimensional arrays that are accessed by tiled instruction code. In our method, array elements are stored in a blocked way, exactly as they are swept by the tiled instruction stream. We present a straightforward way to translate multidimensional array indices into their blocked memory layout using simple binary-mask operations. Indices for such array layouts are easily calculated based on the algebra of dilated integers, similarly to Morton-order indexing. Experimental results, using matrix multiplication and LU decomposition on arrays of various sizes, show that execution time is greatly improved when tiled code is combined with tiled array layouts and binary-mask-based index translation functions. Simulations using the SimpleScalar tool verify that the enhanced performance is due to the considerable reduction in total cache misses.
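The index translation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a square N x N array with N = 2^log_n, square tiles of side 2^log_t, tiles stored in row-major order, and row-major order inside each tile. The function names (blocked_offset, dilate, morton_offset) are hypothetical. The second half shows Morton-order indexing built from dilated integers, for comparison.

```python
def blocked_offset(i, j, log_n, log_t):
    """Linear offset of element (i, j) in a tile-major (blocked) layout.

    Assumption: the array is 2**log_n square and tiles are 2**log_t square,
    so every multiplication reduces to a shift and every modulo to a mask.
    """
    tile_mask = (1 << log_t) - 1                  # selects the in-tile bits
    tile_row, in_row = i >> log_t, i & tile_mask
    tile_col, in_col = j >> log_t, j & tile_mask
    # tile_row * (N/T) * T*T == tile_row << (log_n + log_t)
    return ((tile_row << (log_n + log_t))
            | (tile_col << (2 * log_t))
            | (in_row << log_t)
            | in_col)


def dilate(x, bits=16):
    """Dilated integer: bit k of x moves to bit 2*k, with zeros in between."""
    result = 0
    for k in range(bits):
        result |= ((x >> k) & 1) << (2 * k)
    return result


def morton_offset(i, j):
    """Morton (Z-order) offset: interleave the bits of i and j."""
    return (dilate(i) << 1) | dilate(j)


if __name__ == "__main__":
    # 4 x 4 array with 2 x 2 tiles: print the blocked offset of each element.
    for i in range(4):
        print([blocked_offset(i, j, log_n=2, log_t=1) for j in range(4)])
```

With power-of-two sizes the bit fields never overlap, so the translation is a bijection onto 0 .. N*N - 1, and each tile occupies one contiguous run of memory.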
Pages: 308-317
Number of pages: 10