A scalable queue for work distribution on GPUs

Cited by: 0
Authors
Kerbl B. [1]
Müller J. [1]
Kenzel M. [1]
Schmalstieg D. [1]
Steinberger M. [1]
Affiliations
[1] Kerbl, Bernhard
[2] Müller, Jörg
[3] Kenzel, Michael
[4] Schmalstieg, Dieter
[5] Steinberger, Markus
Source
2018 / Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States / Volume 53
Keywords
concurrent; GPU; parallel; queuing; scheduling
DOI
10.1145/3178487.3178526
Abstract
Harnessing the power of massively parallel devices like the graphics processing unit (GPU) is difficult for algorithms that show dynamic or inhomogeneous workloads. To achieve high performance, such advanced algorithms require scalable, concurrent queues to collect and distribute work. We present a new concurrent work queue, the Broker Queue, a highly efficient, linearizable queue for fine-granular work distribution on the GPU. We evaluate its usability and benefits in contrast to existing queuing algorithms. Our queue is up to one order of magnitude faster than non-blocking queues, and outperforms simpler queue designs that are unfit for fine-granular work distribution. © 2018 ACM.
Pages: 401-402
Number of pages: 1
Related papers
  • [1] A Scalable Queue for Work Distribution on GPUs
    Kerbl, Bernhard
    Mueller, Joerg
    Kenzel, Michael
    Schmalstieg, Dieter
    Steinberger, Markus
    [J]. ACM SIGPLAN NOTICES, 2018, 53 (01) : 401 - 402
  • [2] Agile Queue: A Fast and Scalable Concurrent Queue on GPU
    Polak, Md Sabbir Hossain
    Troendle, David
    Jang, Byunghyun
    [J]. 53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024, : 108 - 109
  • [3] Scalable b-Matching on GPUs
    Naim, Md
    Manne, Fredrik
    [J]. 2018 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2018), 2018, : 637 - 646
  • [4] Scalable Programmable Motion Effects on GPUs
    Huang, Xuezhen
    Hou, Qiming
    Ren, Zhong
    Zhou, Kun
    [J]. COMPUTER GRAPHICS FORUM, 2012, 31 (07) : 2259 - 2266
  • [5] Scalable and Fast Lazy Persistency on GPUs
    Yudha, Ardhi Wiratama Baskara
    Kimura, Keiji
    Zhou, Huiyang
    Solihin, Yan
    [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC 2020), 2020, : 252 - 263
  • [6] ezLDA: Efficient and Scalable LDA on GPUs
    Wang, Shilong
    Liu, Hang
    Gaihre, Anil
    Yu, Hengyong
    [J]. IEEE ACCESS, 2023, 11 : 100165 - 100179
  • [7] Scalable Prototype Learning Using GPUs
    Su, Tonghua
    Li, Songze
    Ma, Peijun
    Deng, Shengchun
    Liang, Guangsheng
    [J]. IMAGE ANALYSIS AND RECOGNITION, ICIAR 2014, PT I, 2014, 8814 : 309 - 319
  • [8] HQL: A Scalable Synchronization Mechanism for GPUs
    Yilmazer, Ayse
    Kaeli, David
    [J]. IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013, : 475 - 486
  • [9] The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU
    Kerbl, Bernhard
    Kenzel, Michael
    Mueller, Joerg H.
    Schmalstieg, Dieter
    Steinberger, Markus
    [J]. INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS 2018), 2018, : 76 - 85
  • [10] Efficient and Scalable k‑Means on GPUs
    Clemens Lutz
    Sebastian Breß
    Tilmann Rabl
    Steffen Zeuch
    Volker Markl
    [J]. Datenbank-Spektrum, 2018, 18 (3) : 157 - 169