An Experimental Comparison of Cache-oblivious and Cache-conscious Programs

Authors
Yotov, Kamen [1]
Roeder, Tom [1]
Pingali, Keshav [1]
Gunnels, John [2]
Gustavson, Fred [2]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
[2] IBM Corp, TJ Watson Res Ctr, Armonk, NY 10504 USA
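Funding
U.S. National Science Foundation
Keywords
Memory hierarchy; Memory latency; Memory bandwidth; Cache-oblivious algorithms; Cache-conscious algorithms; Numerical software
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract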
Cache-oblivious algorithms have been advanced as a way of circumventing some of the difficulties of optimizing applications to take advantage of the memory hierarchy of modern microprocessors. These algorithms are based on the divide-and-conquer paradigm: each division step creates sub-problems of smaller size, and when the working set of a sub-problem fits in some level of the memory hierarchy, the computations in that sub-problem can be executed without suffering capacity misses at that level. In this way, divide-and-conquer algorithms adapt automatically to all levels of the memory hierarchy; in fact, for problems like matrix multiplication, matrix transpose, and FFT, these recursive algorithms are optimal to within constant factors for some theoretical models of the memory hierarchy. An important question is the following: how well do carefully tuned cache-oblivious programs perform compared to carefully tuned cache-conscious programs for the same problem? Is there a price for obliviousness, and if so, how much performance do we lose? Somewhat surprisingly, few studies in the literature have addressed this question. This paper reports the results of such a study in the domain of dense linear algebra. Our main finding is that in this domain, even highly optimized cache-oblivious programs perform significantly worse than corresponding cache-conscious programs. We provide insights into why this is so, and suggest research directions for making cache-oblivious algorithms more competitive.
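
The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch of recursive matrix multiplication. This is not the authors' code: the function names `matmul_oblivious` and `matmul`, the list-of-lists matrix layout, and the `cutoff` parameter are all assumptions made for illustration. The recursion halves the largest of the three dimensions until the sub-problem is small, so the working set shrinks at every step without any cache size appearing in the code.

```python
def matmul_oblivious(A, B, C, i0, i1, j0, j1, k0, k1, cutoff=8):
    """Accumulate A[i0:i1, k0:k1] * B[k0:k1, j0:j1] into C[i0:i1, j0:j1].

    Hypothetical helper: matrices are lists of lists, and `cutoff` is an
    illustrative recursion cutoff (must be >= 1), not a cache parameter.
    """
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= cutoff:
        # Base case: plain triple loop on a small sub-problem.
        for i in range(i0, i1):
            for k in range(k0, k1):
                a = A[i][k]
                for j in range(j0, j1):
                    C[i][j] += a * B[k][j]
        return
    # Divide step: split the largest dimension in half and recurse.
    if di >= dj and di >= dk:
        mid = i0 + di // 2
        matmul_oblivious(A, B, C, i0, mid, j0, j1, k0, k1, cutoff)
        matmul_oblivious(A, B, C, mid, i1, j0, j1, k0, k1, cutoff)
    elif dj >= dk:
        mid = j0 + dj // 2
        matmul_oblivious(A, B, C, i0, i1, j0, mid, k0, k1, cutoff)
        matmul_oblivious(A, B, C, i0, i1, mid, j1, k0, k1, cutoff)
    else:
        mid = k0 + dk // 2
        matmul_oblivious(A, B, C, i0, i1, j0, j1, k0, mid, cutoff)
        matmul_oblivious(A, B, C, i0, i1, j0, j1, mid, k1, cutoff)


def matmul(A, B, cutoff=8):
    """Return A * B for non-empty rectangular list-of-lists matrices."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    matmul_oblivious(A, B, C, 0, n, 0, m, 0, k, cutoff)
    return C
```

A cache-conscious version would instead pick explicit tile sizes from the known cache capacities of the target machine. The sketch above only shows the recursive structure and makes no claim about performance.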
Pages: 93 / +
Page count: 2