GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems

Cited by: 4
Authors
Ino, Fumihiko [1]
Nakagawa, Shinta [2]
Hagihara, Kenichi [1]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka 5650871, Japan
[2] NEC Corp Ltd, Storage Div, Fuchu, Tokyo 1838501, Japan
Source
Keywords
stream processing; GPGPU; CUDA; task scheduling; GRAPHICS;
DOI
10.1587/transinf.E96.D.2604
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a stream programming framework, named GPU-chariot, for accelerating stream applications running on graphics processing units (GPUs). The main contribution of our framework is that it realizes efficient software pipelines on multi-GPU systems by enabling out-of-order execution of CPU functions, kernels, and data transfers. To achieve this out-of-order execution, we apply a runtime scheduler that not only maximizes the utilization of system resources but also encapsulates the number of GPUs available in the system. In addition, we implement a load-balancing capability to flow data efficiently through multiple GPUs. Furthermore, a callback interface enables overlapping execution of functions in third-party libraries. By using kernels with different performance bottlenecks, we show that our out-of-order execution is up to 20% faster than in-order execution. Finally, we conduct several case studies on a 4-GPU system and demonstrate the advantages of GPU-chariot over a manually pipelined code. We conclude that GPU-chariot can be useful when developing stream applications with software pipelines on multiple GPUs and CPUs.
Pages: 2604-2616
Number of pages: 13