Efficient Inspected Critical Sections in Data-Parallel GPU Codes

Cited by: 0
Authors
Blass, Thorsten [1]
Philippsen, Michael [1]
Veldema, Ronald [1]
Affiliations
[1] Friedrich Alexander Univ, Programming Syst Grp, Erlangen, Germany
Keywords
GPGPU; CUDA; SIMT; Critical section; Mutual exclusion
DOI
10.1007/978-3-030-35225-7_15
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Optimistic concurrency control and software transactional memories (STMs) rely on the assumption of sparse conflicts. For data-parallel GPU codes with many or with dynamic data dependences, a pessimistic, lock-based approach may be faster, if only GPUs offered hardware support for GPU-wide fine-grained synchronization. Instead, current GPUs inflict dead- and livelocks on attempts to implement such synchronization in software. The paper demonstrates how to build GPU-wide non-hanging critical sections that are as easy to use as STMs but also get close to the performance of traditional fine-grained locks. Instead of sequentializing all threads that enter a critical section, the novel programmer-guided Inspected Critical Sections (ICS) keep the degree of parallelism up. As in optimistic approaches, threads that are known not to interfere may execute the body of the inspected critical section concurrently.
Pages: 223-239
Page count: 17