Prefetching Irregular References for Software Cache on Cell

Cited by: 0
|
Authors
Chen, Tong [1]
Zhang, Tao [1]
Sura, Zehra [1]
Tallada, Marc Gonzalez [1]
O'Brien, Kathryn [1]
O'Brien, Kevin [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY USA
Keywords
DMA; prefetch; software cache;
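DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract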
The IBM Single Source Research Compiler for the Cell processor (the SSC Research Compiler) was developed to manage the complexity of programming the heterogeneous multi-core Cell processor. The compiler accepts conventional source programs as input, and automatically generates binaries that execute on both the PPU and SPU cores available on a Cell chip. The compiler uses a software cache and direct buffers to manage data in the small local memory of SPUs. However, irregular references, such as a[ind[i]], often become performance bottlenecks. These references are accessed through the software cache, usually with high miss rates. To solve this problem, we propose a method to prefetch irregular references accessed through a software cache that is built upon hardware such as Cell. This method includes code transformation in the compiler and a runtime library component for the software cache. Our design simplifies the synchronization required when prefetching into the software cache, overlaps DMA operations for misses, and avoids frequent context switching to the miss handler. It also minimizes the cache pollution caused by prefetching, by looking both forwards and backwards through the sequence of addresses to be prefetched. We evaluated our prefetching method using the NAS benchmarks. We found that, when applicable, our prefetching can improve the performance of some benchmarks by 2 times on average, and by close to 4 times in the best case. We also present data to show the impact of different configurations and optimizations when prefetching in a software cache.
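To make the idea of an irregular reference such as a[ind[i]] concrete, the following is a toy simulation, not the paper's compiler transformation or runtime library. It assumes a hypothetical direct-mapped `SoftwareCache` class, a line size of 4 elements, and hypothetical helper names (`sum_plain`, `sum_prefetched`). It only illustrates how prefetching a block of upcoming indices in one batch can replace per-access demand misses.

```python
LINE = 4  # elements per cache line (assumed value for illustration)


class SoftwareCache:
    """Hypothetical direct-mapped software cache over a flat 'global' memory."""

    def __init__(self, memory, num_lines=8):
        self.memory = memory
        self.num_lines = num_lines
        self.lines = {}          # slot -> (tag, list of LINE elements)
        self.demand_misses = 0   # misses taken on the access path
        self.dma_batches = 0     # batched transfers issued by prefetch()

    def _present(self, tag):
        entry = self.lines.get(tag % self.num_lines)
        return entry is not None and entry[0] == tag

    def _fetch(self, tag):
        base = tag * LINE
        self.lines[tag % self.num_lines] = (tag, self.memory[base:base + LINE])

    def prefetch(self, addrs):
        """Bring in every missing line for addrs as one batched transfer."""
        missing = sorted({a // LINE for a in addrs if not self._present(a // LINE)})
        if missing:
            self.dma_batches += 1
            for tag in missing:
                self._fetch(tag)

    def load(self, addr):
        tag = addr // LINE
        if not self._present(tag):
            self.demand_misses += 1   # stands in for a call to the miss handler
            self._fetch(tag)
        return self.lines[tag % self.num_lines][1][addr % LINE]


def sum_plain(cache, ind):
    """Original loop: every a[ind[i]] goes through the cache one at a time."""
    return sum(cache.load(j) for j in ind)


def sum_prefetched(cache, ind, block=4):
    """Transformed loop: prefetch a block of indices, then access them."""
    total = 0
    for start in range(0, len(ind), block):
        chunk = ind[start:start + block]
        cache.prefetch(chunk)
        total += sum(cache.load(j) for j in chunk)
    return total


if __name__ == "__main__":
    a = list(range(64))
    ind = [3, 40, 21, 58, 9, 46, 27, 60]
    plain, pre = SoftwareCache(a), SoftwareCache(a)
    print(sum_plain(plain, ind), plain.demand_misses)
    print(sum_prefetched(pre, ind), pre.demand_misses, pre.dma_batches)
```

In this sketch the index list is chosen so that no two lines in a block map to the same cache slot. When they do, a prefetched line can evict another line from the same block before it is used, which is the kind of cache pollution the abstract refers to.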
Pages: 155-164
Page count: 10
Related papers
50 in total
  • [31] BERT4Cache: a bidirectional encoder representations for data prefetching in cache
    Shang, Jing
    Wu, Zhihui
    Xiao, Zhiwen
    Zhang, Yifei
    Wang, Jibin
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [32] Data prefetching for non-linear memory references
    Chi, CH
    Cheung, CM
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1998, 1401 : 757 - 765
  • [33] BERT4Cache: a bidirectional encoder representations for data prefetching in cache
    Shang, Jing
    Wu, Zhihui
    Xiao, Zhiwen
    Zhang, Yifei
    Wang, Jibin
    PeerJ Computer Science, 2024, 10 : 1 - 21
  • [34] Graph4Cache: A Graph Neural Network Model for Cache Prefetching
    Shang, Jing
    Wu, Zhihui
    Xiao, Zhiwen
    Zhang, Yifei
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2024, 61 (08) : 1945 - 1956
  • [35] Software data prefetching for software pipelined loops
    Sánchez, J
    González, A
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1999, 58 (02) : 236 - 259
  • [36] Efficient Metadata Management for Irregular Data Prefetching
    Wu, Hao
    Nathella, Krishnendra
    Sunwoo, Dam
    Jain, Akanksha
    Lin, Calvin
    PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19), 2019, : 449 - 461
  • [37] Resource Conscious Prefetching for Irregular Applications in Multicores
    Khan, Muneeb
    Hagersten, Erik
    2014 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING, AND SIMULATION (SAMOS XIV), 2014, : 34 - 43
  • [38] Broadcast based cache invalidation and prefetching in mobile environment
    Chand, N
    Joshi, R
    Misra, M
    HIGH PERFORMANCE COMPUTING - HIPC 2004, 2004, 3296 : 410 - 419
  • [39] A miss history-based architecture for cache prefetching
    Phalke, V
    Gopinath, B
    MEMORY MANAGEMENT, 1995, 986 : 381 - 398
  • [40] Studying interactions between prefetching and cache line turnoff
    Kadayif, Ismail
    Kandemir, Mahmut
    Chen, Guilin
    ASP-DAC 2005: PROCEEDINGS OF THE ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2, 2005, : 545 - 548