Transactional Prefetching: Narrowing the Window of Contention in Hardware Transactional Memory

Authors
Negi, Anurag [1]
Armejach, Adria [2, 3]
Cristal, Adrian [2, 4]
Unsal, Osman S. [2]
Stenstrom, Per [1]
Affiliations
[1] Chalmers Univ Technol, Gothenburg, Sweden
[2] Barcelona Supercomp Ctr, Barcelona, Spain
[3] Univ Politecn Cataluna, Barcelona, Spain
[4] Spanish Natl Res Council, CSIC, IIIA, Barcelona, Spain
Keywords
hardware transactional memory; multicores; prefetching
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Memory access latency is the primary performance bottleneck in modern computer systems. Prefetching data before it is needed by a processing core allows substantial performance gains by overlapping significant portions of memory latency with useful work. Prior work has investigated this technique and measured its potential benefits in a variety of scenarios. However, its use in speeding up Hardware Transactional Memory (HTM) has so far remained unexplored. In several HTM designs, transactions invalidate speculatively updated cache lines when they abort. Such cache lines tend to have high locality and are likely to be accessed again when the transaction re-executes. Coarse-grained transactions that update several cache lines are particularly susceptible to performance degradation, even under moderate contention. Yet such transactions show strong locality of reference, especially when contention is high. Prefetching cache lines with high locality can therefore improve overall concurrency by speeding up transactions and thereby narrowing the window of time in which such transactions persist and can cause contention. Such transactions are important since they are likely to form a common TM use case. We note that traditional prefetch techniques may not be able to track such lines adequately or issue prefetches quickly enough. This paper investigates the use of prefetching in HTMs, proposes a simple design to identify and request prefetch candidates, and measures the performance gains achievable for several representative TM workloads.
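The effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the class `ToyCache`, the function `retry_latency`, and the cycle costs are all hypothetical names and values chosen for illustration, and the prefetch is assumed to overlap fully with other work. It shows only that lines invalidated on abort miss again on re-execution unless they are remembered and refetched first.

```python
from dataclasses import dataclass, field

HIT_CYCLES = 1     # assumed cost of a cache hit (illustrative value)
MISS_CYCLES = 100  # assumed cost of a cache miss (illustrative value)


@dataclass
class ToyCache:
    """Toy private cache; a hypothetical model, not the paper's hardware."""
    lines: set = field(default_factory=set)
    spec_written: set = field(default_factory=set)
    prefetch_candidates: set = field(default_factory=set)

    def write(self, addr):
        """Speculative transactional write; returns its latency in cycles."""
        cost = HIT_CYCLES if addr in self.lines else MISS_CYCLES
        self.lines.add(addr)
        self.spec_written.add(addr)
        return cost

    def abort(self):
        """Invalidate speculatively updated lines and remember them."""
        self.lines -= self.spec_written
        self.prefetch_candidates |= self.spec_written
        self.spec_written.clear()

    def prefetch(self):
        """Refill remembered lines (assumed to cost nothing on the critical path)."""
        self.lines |= self.prefetch_candidates
        self.prefetch_candidates.clear()


def retry_latency(write_set, use_prefetch):
    """Latency of the re-executed transaction after one abort."""
    cache = ToyCache()
    for addr in write_set:  # first attempt brings the lines into the cache
        cache.write(addr)
    cache.abort()           # conflict: speculative lines are invalidated
    if use_prefetch:
        cache.prefetch()
    return sum(cache.write(addr) for addr in write_set)
```

Under these assumptions, a transaction that writes four lines re-executes in 400 cycles without prefetching and in 4 cycles with it, which is the shortened contention window the abstract refers to.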
Pages: 181-190
Page count: 10