Improving cache locality with blocked array layouts

Cited by: 1
Authors
Athanasaki, E [1]
Koziris, N [1]
Affiliations
[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Comp Syst Lab, GR-15773 Zografos, Greece
DOI
10.1109/EMPDP.2004.1271460
Chinese Library Classification
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
Minimizing cache misses is one of the most important factors in reducing the average latency of memory accesses. Tiled codes modify the instruction stream to exploit cache locality for array accesses. In this paper we further reduce cache misses by restructuring the memory layout of multidimensional arrays that are accessed by tiled instruction code. In our method, array elements are stored in a blocked way, exactly as they are swept by the tiled instruction stream. We present a straightforward way to translate multidimensional array indices into their blocked memory layout using simple binary-mask operations. Indices for such array layouts are easily calculated based on the algebra of dilated integers, similarly to Morton-order indexing. Experimental results, using matrix multiplication and LU decomposition on arrays of various sizes, illustrate that execution time is greatly improved when combining tiled code with tiled array layouts and binary mask-based index translation functions. Simulations using the SimpleScalar tool verify that the enhanced performance is due to the considerable reduction in total cache misses.
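The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the authors' actual index translation functions. It assumes a square array whose side n and tile side t are powers of two, with tiles stored one after another and row-major order inside each tile. The names blocked_index, dilate, and morton_index are hypothetical.

```python
def blocked_index(i, j, n, t):
    """Offset of element (i, j) in an n x n array stored tile by tile.

    Assumption: n and t are powers of two, tiles are t x t, tiles are laid
    out row-major, and elements inside a tile are row-major as well.
    """
    log_t = t.bit_length() - 1   # log2(t)
    log_n = n.bit_length() - 1   # log2(n)
    in_mask = t - 1              # selects the position inside a tile
    tile_row = i >> log_t
    tile_col = j >> log_t
    return ((tile_row << (log_n + log_t))    # skip whole rows of tiles
            | (tile_col << (2 * log_t))      # skip tiles to the left
            | ((i & in_mask) << log_t)       # row inside the tile
            | (j & in_mask))                 # column inside the tile


def dilate(x):
    """Spread the low 16 bits of x so that bit k moves to bit 2k."""
    x &= 0xFFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton_index(i, j):
    """Morton (Z-order) offset: bits of i and j interleaved, i on odd bits."""
    return (dilate(i) << 1) | dilate(j)
```

For a 4 x 4 array with 2 x 2 tiles, blocked_index(0, 2, 4, 2) falls at the start of the second tile, so elements that a tiled loop visits together are contiguous in memory.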
Pages: 308-317
Page count: 10