Prefetching Irregular References for Software Cache on Cell

Cited by: 0
Authors
Chen, Tong [1]
Zhang, Tao [1]
Sura, Zehra [1]
Tallada, Marc Gonzalez [1]
O'Brien, Kathryn [1]
O'Brien, Kevin [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY USA
Keywords
DMA; prefetch; software cache;
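DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract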
The IBM Single Source Research Compiler for the Cell processor (the SSC Research Compiler) was developed to manage the complexity of programming the heterogeneous multi-core Cell processor. The compiler accepts conventional source programs as input, and automatically generates binaries that execute on both the PPU and SPU cores available on a Cell chip. The compiler uses a software cache and direct buffers to manage data in the small local memory of SPUs. However, irregular references, such as a[ind[i]], often become performance bottlenecks. These references are accessed through the software cache, usually with high miss rates. To solve this problem, we propose a method to prefetch irregular references accessed through a software cache that is built upon hardware such as Cell. This method includes code transformation in the compiler and a runtime library component for the software cache. Our design simplifies the synchronization required when prefetching into the software cache, overlaps DMA operations for misses, and avoids frequent context switching to the miss handler. It also minimizes the cache pollution caused by prefetching, by looking both forwards and backwards through the sequence of addresses to be prefetched. We evaluated our prefetching method using the NAS benchmarks. We found that, when applicable, our prefetching can improve the performance of some benchmarks by 2 times on average, and by close to 4 times in the best case. We also present data to show the impact of different configurations and optimizations when prefetching in a software cache.
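To make the a[ind[i]] pattern concrete, the following is a minimal host-side sketch in C, not the paper's implementation. It assumes a tiny direct-mapped software cache and strip-mines the loop into a prefetch phase and a compute phase. All names (`sw_cache`, `cache_touch`, `gather_sum`) and the sizes `LINE_ELEMS`, `NUM_LINES`, and `BLOCK` are hypothetical, and `memcpy` stands in for a DMA get. On a real SPU the fetches in the prefetch phase would be issued asynchronously and waited on once per block; here they run synchronously, so the sketch shows only the code shape and the miss accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_ELEMS 16   /* doubles per cache line (assumed) */
#define NUM_LINES  64   /* direct-mapped line slots (assumed) */
#define BLOCK      32   /* iterations prefetched per strip (assumed) */

typedef struct {
    long   tag[NUM_LINES];               /* line number held, -1 if empty */
    double data[NUM_LINES][LINE_ELEMS];  /* local copies of the lines */
    long   misses;                       /* number of line fetches */
} sw_cache;

static void cache_init(sw_cache *c) {
    for (int i = 0; i < NUM_LINES; i++) c->tag[i] = -1;
    c->misses = 0;
}

/* Stand-in for a DMA get: copy one line from "global" memory. */
static void fetch_line(sw_cache *c, const double *a, size_t n, long line) {
    size_t slot = (size_t)line % NUM_LINES;
    size_t base = (size_t)line * LINE_ELEMS;
    size_t cnt  = (n - base < LINE_ELEMS) ? n - base : LINE_ELEMS;
    memcpy(c->data[slot], a + base, cnt * sizeof(double));
    c->tag[slot] = line;
    c->misses++;
}

/* Make sure the line holding a[idx] is resident (the prefetch "touch"). */
static void cache_touch(sw_cache *c, const double *a, size_t n, size_t idx) {
    assert(idx < n);
    long line = (long)(idx / LINE_ELEMS);
    if (c->tag[(size_t)line % NUM_LINES] != line) fetch_line(c, a, n, line);
}

/* Read a[idx] through the cache; refetches if the line was evicted. */
static double cache_read(sw_cache *c, const double *a, size_t n, size_t idx) {
    cache_touch(c, a, n, idx);
    return c->data[(idx / LINE_ELEMS) % NUM_LINES][idx % LINE_ELEMS];
}

/* Sum of a[ind[i]] for i in [0, m), strip-mined into blocks. */
double gather_sum(sw_cache *c, const double *a, size_t n,
                  const size_t *ind, size_t m) {
    double sum = 0.0;
    for (size_t start = 0; start < m; start += BLOCK) {
        size_t end = (start + BLOCK < m) ? start + BLOCK : m;
        for (size_t i = start; i < end; i++)   /* prefetch phase */
            cache_touch(c, a, n, ind[i]);
        for (size_t i = start; i < end; i++)   /* compute phase */
            sum += cache_read(c, a, n, ind[i]);
    }
    return sum;
}
```

If two addresses in the same block map to the same slot, the later prefetch evicts the earlier line and the compute phase takes an extra miss. This is the kind of cache pollution the abstract says the authors reduce by looking forwards and backwards through the address sequence; the sketch does not attempt that and simply falls back to refetching.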
Pages: 155-164
Page count: 10
Related papers
50 in total
  • [1] Adaptive cache line strategy for irregular references on Cell architecture
    Cao Q.
    Hu C.-J.
    Zhang Y.-X.
    Zhu Y.-T.
    Jisuanji Xuebao/Chinese Journal of Computers, 2011, 34 (05): : 898 - 911
  • [2] Adaptive Line Size Cache for Irregular References on Cell Multicore Processor
    Cao, Qian
    Zhao, Chongchong
    Chen, Junxiu
    Zhang, Yunxing
    Chen, Yi
    NETWORK AND PARALLEL COMPUTING, 2010, 6289 : 314 - 328
  • [3] An automated method for software controlled cache prefetching
    Zucker, DF
    Lee, RB
    Flynn, MJ
    PROCEEDINGS OF THE THIRTY-FIRST HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, VOL VII: SOFTWARE TECHNOLOGY TRACK, 1998, : 106 - 114
  • [4] WCET analysis of unified cache with software prefetching
    An, Li-Kui
    Gu, Zhi-Min
    Fu, Yin-Xia
    Zhao, Xin
    Gan, Zhi-Hua
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2015, 35 (07): : 730 - 736
  • [5] Hardware and software cache prefetching techniques for MPEG benchmarks
    Zucker, DF
    Lee, RB
    Flynn, MJ
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2000, 10 (05) : 782 - 796
  • [6] Adaptive software prefetching in scalable multiprocessors using cache information
    Park, D
    Seong, BH
    Saavedra, RH
    PARALLEL COMPUTING, 2001, 27 (09) : 1173 - 1195
  • [7] Interrupt triggered software prefetching for embedded CPU instruction cache
    Batcher, Ken W.
    Walker, Robert A.
    PROCEEDINGS OF THE 12TH IEEE REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM, 2006, : 91 - +
  • [8] Cache Prefetching in Embedded DSPs
    Vaintraub, Adiel
    Kahn, Roger
    Weiss, Shlomo
    2018 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING IN ISRAEL (ICSEE), 2018,
  • [9] Pointer cache assisted prefetching
    Collins, J
    Sair, S
    Calder, B
    Tullsen, DM
    35TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-35), PROCEEDINGS, 2002, : 62 - 73
  • [10] Reducing cache pollution of prefetching in a small data cache
    Reungsang, P
    Park, SK
    Jeong, SW
    Roh, HL
    Lee, G
    2001 INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, ICCD 2001, PROCEEDINGS, 2001, : 530 - 533