A scalable queue for work distribution on GPUs

Cited by: 0

Authors:

Kerbl B. ^{[1]}

Müller J. ^{[1]}

Kenzel M. ^{[1]}

Schmalstieg D. ^{[1]}

Steinberger M. ^{[1]}

Affiliations:

[1] Kerbl, Bernhard

[2] Müller, Jörg

[3] Kenzel, Michael

[4] Schmalstieg, Dieter

[5] Steinberger, Markus

Source:

2018 / Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States / Vol. 53

Keywords:

concurrent; GPU; parallel; queuing; scheduling

DOI:

10.1145/3178487.3178526

Abstract:

Harnessing the power of massively parallel devices like the graphics processing unit (GPU) is difficult for algorithms that show dynamic or inhomogeneous workloads. To achieve high performance, such advanced algorithms require scalable, concurrent queues to collect and distribute work. We present a new concurrent work queue, the Broker Queue, a highly efficient, linearizable queue for fine-granular work distribution on the GPU. We evaluate its usability and benefits in contrast to existing queuing algorithms. Our queue is up to one order of magnitude faster than non-blocking queues, and outperforms simpler queue designs that are unfit for fine-granular work distribution. © 2018 ACM.

Pages: 401-402

Page count: 1

Citation:

Kerbl, Bernhard; Mueller, Joerg; Kenzel, Michael; Schmalstieg, Dieter; Steinberger, Markus. A Scalable Queue for Work Distribution on GPUs [J]. ACM SIGPLAN Notices, 2018, 53 (01): 401-402