Prefetching Irregular References for Software Cache on Cell

Authors
Chen, Tong [1]
Zhang, Tao [1]
Sura, Zehra [1]
Tallada, Marc Gonzalez [1]
O'Brien, Kathryn [1]
O'Brien, Kevin [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY USA
Keywords
DMA; prefetch; software cache;
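DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract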
The IBM Single Source Research Compiler for the Cell processor (the SSC Research Compiler) was developed to manage the complexity of programming the heterogeneous multi-core Cell processor. The compiler accepts conventional source programs as input, and automatically generates binaries that execute on both the PPU and SPU cores available on a Cell chip. The compiler uses a software cache and direct buffers to manage data in the small local memory of SPUs. However, irregular references, such as a[ind[i]], often become performance bottlenecks. These references are accessed through the software cache, usually with high miss rates. To solve this problem, we propose a method to prefetch irregular references accessed through a software cache that is built upon hardware such as Cell. This method includes code transformation in the compiler and a runtime library component for the software cache. Our design simplifies the synchronization required when prefetching into the software cache, overlaps DMA operations for misses, and avoids frequent context switching to the miss handler. It also minimizes the cache pollution caused by prefetching, by looking both forwards and backwards through the sequence of addresses to be prefetched. We evaluated our prefetching method using the NAS benchmarks. We found that, when applicable, our prefetching can improve the performance of some benchmarks by 2 times on average, and by close to 4 times in the best case. We also present data to show the impact of different configurations and optimizations when prefetching in a software cache.
Pages: 155-164
Page count: 10
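Illustrative sketch (not from the paper)
The abstract describes prefetching irregular references such as a[ind[i]] into a software cache so that misses are handled in a batch rather than one at a time in the miss handler. The C sketch below is only a rough illustration of that general idea, not the authors' transformation or runtime library. Every name and parameter in it (sw_cache, fill_line, prefetch_block, sum_irregular, LINE_ELEMS, NUM_LINES, BLOCK) is hypothetical. A memcpy stands in for a DMA transfer, the cache is a simple direct-mapped one, the array length is assumed to be a multiple of LINE_ELEMS, and the pollution-avoidance step mentioned in the abstract is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_ELEMS 4   /* elements per cache line (assumed value) */
#define NUM_LINES  64  /* direct-mapped cache lines (assumed value) */
#define BLOCK      16  /* loop iterations prefetched at a time (assumed value) */

typedef struct {
    long   tag[NUM_LINES];              /* global line held in each slot, -1 if empty */
    double data[NUM_LINES][LINE_ELEMS]; /* local copies of the cached lines */
    long   demand_misses;               /* misses taken inside the compute loop */
    long   prefetch_fills;              /* lines brought in ahead of use */
} sw_cache;

void cache_init(sw_cache *c) {
    for (int i = 0; i < NUM_LINES; i++) c->tag[i] = -1;
    c->demand_misses = 0;
    c->prefetch_fills = 0;
}

/* Stand-in for a DMA get: copy one line from "global memory" into its slot. */
static void fill_line(sw_cache *c, const double *a, long line) {
    int slot = (int)(line % NUM_LINES);
    memcpy(c->data[slot], a + line * LINE_ELEMS, sizeof c->data[slot]);
    c->tag[slot] = line;
}

/* Ordinary software-cache lookup; a miss is serviced immediately. */
static double cache_read(sw_cache *c, const double *a, long idx) {
    long line = idx / LINE_ELEMS;
    int  slot = (int)(line % NUM_LINES);
    if (c->tag[slot] != line) {
        c->demand_misses++;
        fill_line(c, a, line);
    }
    return c->data[slot][idx % LINE_ELEMS];
}

/* Walk the index stream for iterations [lo, hi) and fill any missing lines.
   On real hardware these transfers could be issued together and overlapped. */
static void prefetch_block(sw_cache *c, const double *a, const int *ind,
                           size_t lo, size_t hi) {
    for (size_t i = lo; i < hi; i++) {
        long line = ind[i] / LINE_ELEMS;
        if (c->tag[line % NUM_LINES] != line) {
            fill_line(c, a, line);
            c->prefetch_fills++;
        }
    }
}

/* Sums a[ind[i]] for i in [0, n), processing the loop in blocks of BLOCK
   iterations; with use_prefetch set, each block is prefetched before use. */
double sum_irregular(sw_cache *c, const double *a, const int *ind,
                     size_t n, int use_prefetch) {
    double sum = 0.0;
    for (size_t lo = 0; lo < n; lo += BLOCK) {
        size_t hi = (lo + BLOCK < n) ? lo + BLOCK : n;
        if (use_prefetch) prefetch_block(c, a, ind, lo, hi);
        for (size_t i = lo; i < hi; i++) sum += cache_read(c, a, ind[i]);
    }
    return sum;
}
```

To use it, call cache_init on a sw_cache, then sum_irregular with use_prefetch set to 0 or 1, and compare the demand_misses counters. In this simplified model, two indices in the same block that map to the same slot can still evict each other and cause a demand miss; the lookup remains correct because cache_read refills the line.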