An efficient sequential consistency implementation with dynamic race detection for GPUs

Cited by: 0
Authors
Tabbakh, Abdulaziz [1]
Annavaram, Murali [2]
Affiliations
[1] King Fahd Univ Petr & Minerals, Comp Engn Dept, POB 5065, Dhahran 31261, Saudi Arabia
[2] Univ Southern Calif, Elect Engn Dept, 3740 McClintock Ave, Los Angeles, CA 90089 USA
Keywords
Computer architecture; GPU; Memory coherence; Sequential consistency; COHERENCE; OVERHEAD
DOI
10.1016/j.jpdc.2023.104836
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
As GPUs are increasingly used for general-purpose computation, applications with different memory access requirements have emerged. Despite the growing demand, only a few GPU coherence protocols and memory models have been explored in research, and even fewer have been implemented in products. In the CPU domain, by contrast, a diverse range of memory models for parallel programming has been proposed, exploring the interplay between performance and programmability. Sequential consistency (SC) is one of the strict memory models. It provides the most programmer-intuitive execution of memory operations, but it imposes strict ordering restrictions on those operations, which causes performance overhead. Hence, implementing and supporting SC is one of the most challenging tasks on any computing platform, and GPUs are no exception. In this paper, we propose a GPU architecture that implements the SC memory model with minimal performance and power overhead. We achieve this goal by designing a mechanism that detects races between different streaming multiprocessors (SMs) dynamically at runtime. Races are detected using a signature-based mechanism that keeps track of sets of unseen updates for each SM, which significantly reduces the hardware implementation cost at the price of a small increase in invalidation traffic. Our experiments show that dynamic race detection can be used to implement sequential consistency with 5% performance overhead.
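The abstract gives no implementation detail, so the following is only a software sketch of what signature-based tracking of unseen updates could look like, not the authors' hardware design. It assumes a Bloom-filter-style signature per SM. The names `Signature` and `RaceDetector`, the signature size, the hash choice, and the clear-on-synchronization policy are all hypothetical. Because a signature is a lossy summary, it can report a conflict that did not happen, which is one plausible reading of why the abstract mentions extra invalidation traffic.

```python
import hashlib


class Signature:
    """Fixed-size, Bloom-filter-style summary of a set of addresses.

    The size and hash count are hypothetical, chosen only for illustration.
    """

    def __init__(self, num_bits=256, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored as a Python int

    def _positions(self, addr):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{addr}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, addr):
        for p in self._positions(addr):
            self.bits |= 1 << p

    def may_contain(self, addr):
        # True means "possibly present" (false positives are allowed);
        # False means "definitely absent".
        return all((self.bits >> p) & 1 for p in self._positions(addr))

    def clear(self):
        self.bits = 0


class RaceDetector:
    """One signature per SM, summarizing updates that SM has not yet seen."""

    def __init__(self, num_sms):
        self.unseen = [Signature() for _ in range(num_sms)]

    def on_write(self, sm, addr):
        # A write by one SM is an unseen update for every other SM.
        for other, sig in enumerate(self.unseen):
            if other != sm:
                sig.add(addr)

    def on_read(self, sm, addr):
        # A hit means the read may race with an unseen update, so the
        # SM's local copy would have to be invalidated (assumed policy).
        return self.unseen[sm].may_contain(addr)

    def on_sync(self, sm):
        # Assumed policy: once the SM has observed all pending updates,
        # its signature is reset.
        self.unseen[sm].clear()


detector = RaceDetector(num_sms=4)
detector.on_write(sm=0, addr=0x1000)
print(detector.on_read(sm=1, addr=0x1000))  # SM 1 has not seen SM 0's write
print(detector.on_read(sm=0, addr=0x1000))  # the writer has no unseen update
```

The design point illustrated here is the trade-off the abstract describes: a fixed-size signature costs far less storage than an exact per-address record, and the occasional false positive only triggers an unnecessary invalidation rather than an incorrect result.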
Pages: 14