Automatic Partitioning of Stencil Computations on Heterogeneous Systems

Cited by: 4
Authors
Pereira, Alyson D. [1 ]
Rocha, Rodrigo C. O. [4 ]
Ramos, Luiz [3 ]
Castro, Marcio [1 ]
Goes, Luis F. W. [2 ]
Affiliations
[1] Univ Fed Santa Catarina, Florianopolis, SC, Brazil
[2] Pontificia Univ Catolica Minas Gerais, Belo Horizonte, MG, Brazil
[3] Univ Estadual Campinas, Campinas, SP, Brazil
[4] Univ Edinburgh, Edinburgh, Midlothian, Scotland
Keywords
Stencil; Work Partitioning; Decision Tree Learning;
DOI
10.1109/SBAC-PADW.2017.16
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The stencil pattern is important in many scientific and engineering domains, spurring great interest from researchers and industry. In recent years, various optimizations have been proposed for parallel stencil applications running on GPUs. However, most runtime systems that execute those applications fail to fully utilize the parallelism of modern heterogeneous systems. In this paper, we propose a mechanism based on machine learning that automatically partitions stencil computations across the CPU and GPU. We implemented it in the PSkel framework and found that the mechanism can boost the performance of stencil applications on average by 17.9x compared to their sequential CPU-only counterparts, by 1.34x compared to a GPU-only version, and by 1.48x compared to a parallel CPU-only version.
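As a rough illustration of what partitioning a stencil computation means, the sketch below splits the rows of a 2D grid into two contiguous blocks and applies one 4-neighbour averaging step to each block separately. It is not the PSkel API and not the paper's learned mechanism: the function names and the `gpu_fraction` parameter are hypothetical, the split ratio is supplied by hand rather than predicted by a model, and both blocks run on the host.

```python
def jacobi_rows(grid, row_start, row_end):
    """Apply one 4-neighbour averaging step to rows [row_start, row_end).

    Reads from the full input grid, so the neighbouring (halo) rows owned by
    the other partition are available. Border cells are copied unchanged.
    """
    height, width = len(grid), len(grid[0])
    out = []
    for i in range(row_start, row_end):
        row = list(grid[i])
        if 0 < i < height - 1:
            for j in range(1, width - 1):
                row[j] = (grid[i - 1][j] + grid[i + 1][j]
                          + grid[i][j - 1] + grid[i][j + 1]) / 4.0
        out.append(row)
    return out


def partitioned_step(grid, gpu_fraction):
    """Split the rows into two contiguous blocks and process each separately.

    gpu_fraction is a hypothetical knob (an assumption of this sketch): the
    share of rows assigned to the first ("GPU") block; the rest goes to the
    second ("CPU") block. Only the work split is illustrated here.
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be within [0, 1]")
    split = round(len(grid) * gpu_fraction)
    gpu_rows = jacobi_rows(grid, 0, split)
    cpu_rows = jacobi_rows(grid, split, len(grid))
    return gpu_rows + cpu_rows


# Small 6x5 input grid with values i*i + j.
grid = [[float(i * i + j) for j in range(5)] for i in range(6)]
whole = jacobi_rows(grid, 0, len(grid))
split_result = partitioned_step(grid, 0.5)
```

Because each block reads its halo rows from the shared input, the partitioned result matches the unpartitioned one for any split. On a real heterogeneous system the halo rows would have to be exchanged between device memories after every iteration, which is one reason the choice of split ratio affects performance.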
Pages: 43-48
Page count: 6
Related papers
50 in total
  • [1] Data Partitioning Strategies for Stencil Computations on NUMA Systems
    Feinbube, Frank
    Plauth, Max
    Knaust, Marius
    Polze, Andreas
    [J]. EURO-PAR 2017: PARALLEL PROCESSING WORKSHOPS, 2018, 10659 : 597 - 609
  • [2] Effective Automatic Parallelization of Stencil Computations
    Krishnamoorthy, Sriram
    Baskaran, Muthu
    Bondhugula, Uday
    Ramanujam, J.
    Rountev, Atanas
    Sadayappan, P.
    [J]. PLDI'07: PROCEEDINGS OF THE 2007 ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, 2007, : 235 - 244
  • [3] Effective automatic parallelization of stencil computations
    Krishnamoorthy, Sriram
    Baskaran, Muthu
    Bondhugula, Uday
    Ramanujam, J.
    Rountev, Atanas
    Sadayappan, P.
    [J]. ACM SIGPLAN NOTICES, 2007, 42 (06) : 235 - 244
  • [4] Automatic Adaptive Approximation for Stencil Computations
    Schmitt, Maxime
    Helluy, Philippe
    Bastoul, Cedric
    [J]. PROCEEDINGS OF THE 28TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION (CC '19), 2019, : 170 - 181
  • [5] Automatic Performance Tuning of Stencil Computations on GPUs
    Garvey, Joseph D.
    Abdelrahman, Tarek S.
    [J]. 2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2015, : 300 - 309
  • [6] TOAST: Automatic tiling for iterative stencil computations on GPUs
    Rocha, Rodrigo C. O.
    Pereira, Alyson D.
    Ramos, Luiz
    Goes, Luis F. W.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (08):
  • [7] A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
    Garvey, Joseph D.
    Abdelrahman, Tarek S.
    [J]. SCIENTIFIC PROGRAMMING, 2018, 2018
  • [8] Unleashing the performance of ccNUMA multiprocessor architectures in heterogeneous stencil computations
    Szustak, Lukasz
    Halbiniak, Kamil
    Wyrzykowski, Roman
    Jakl, Ondrej
    [J]. JOURNAL OF SUPERCOMPUTING, 2019, 75 (12) : 7765 - 7777
  • [9] Unleashing the performance of ccNUMA multiprocessor architectures in heterogeneous stencil computations
    Lukasz Szustak
    Kamil Halbiniak
    Roman Wyrzykowski
    Ondřej Jakl
    [J]. The Journal of Supercomputing, 2019, 75 : 7765 - 7777
  • [10] A FRAMEWORK FOR PARTITIONING PARALLEL COMPUTATIONS IN HETEROGENEOUS ENVIRONMENTS
    WEISSMAN, JB
    GRIMSHAW, AS
    [J]. CONCURRENCY-PRACTICE AND EXPERIENCE, 1995, 7 (05): : 455 - 478