Evaluation of hardware-based stride and sequential prefetching in shared-memory multiprocessors

Cited by: 32
Authors
Dahlgren, F [1]
Stenstrom, P [1]
Affiliations
[1] CHALMERS UNIV TECHNOL, DEPT COMP ENGN, S-41296 GOTHENBURG, SWEDEN
Keywords
hardware-controlled prefetching; latency tolerance; performance evaluation; relaxed memory consistency; shared-memory multiprocessors
DOI
10.1109/71.494633
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We study the efficiency of previously proposed stride and sequential prefetching, two promising hardware-based prefetching schemes for reducing read-miss penalties in shared-memory multiprocessors. Although stride accesses dominate in four of the six applications we study, we find that sequential prefetching does as well as, and in some cases even better than, stride prefetching for five applications. This is because 1) most strides are shorter than the block size (we assume 32-byte blocks), which means that sequential prefetching is as effective for these stride accesses, and 2) sequential prefetching also exploits the locality of read misses with nonstride accesses. However, since stride prefetching in general results in fewer useless prefetches, it offers the extra advantage of consuming less memory-system bandwidth.
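The block-size argument in the abstract can be checked with a little arithmetic. The sketch below is an illustration only and is not the simulation model used in the paper: the helper names (`blocks_touched`, `next_block_covers`, `useless_next_block_prefetches`) are hypothetical, strides are assumed positive, and the "sequential" scheme is simplified to one next-block prefetch per distinct block touched.

```python
BLOCK_SIZE = 32  # bytes; the block size assumed in the abstract


def blocks_touched(start, stride, count, block_size=BLOCK_SIZE):
    """Return the distinct block numbers a strided read stream touches, in order.

    Assumes a positive stride, so block numbers never decrease.
    """
    blocks = []
    for i in range(count):
        block = (start + i * stride) // block_size
        if not blocks or blocks[-1] != block:
            blocks.append(block)
    return blocks


def next_block_covers(blocks):
    """True if each touched block is the immediate successor of the previous one,
    i.e. a next-block (sequential) prefetch would always fetch the right block."""
    return all(b - a == 1 for a, b in zip(blocks, blocks[1:]))


def useless_next_block_prefetches(blocks):
    """Count next-block prefetches that fetch a block the stream never reads.

    Simplifying assumption: one prefetch of block b + 1 per distinct block b.
    """
    touched = set(blocks)
    return sum(1 for b in blocks if b + 1 not in touched)


if __name__ == "__main__":
    short = blocks_touched(start=0, stride=8, count=64)   # stride < block size
    long_ = blocks_touched(start=0, stride=96, count=8)   # stride > block size
    print(next_block_covers(short), useless_next_block_prefetches(short))
    print(next_block_covers(long_), useless_next_block_prefetches(long_))
```

Under these assumptions, an 8-byte stride walks through consecutive 32-byte blocks, so the next block is exactly the one needed, while a 96-byte stride skips blocks and every next-block prefetch is wasted. That is consistent with the abstract's two observations: sequential prefetching matches stride prefetching when strides are short, and stride prefetching issues fewer useless prefetches otherwise.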
Pages: 385-398
Page count: 14