Automatic Partitioning of Stencil Computations on Heterogeneous Systems

Cited by: 4
Authors
Pereira, Alyson D. [1]
Rocha, Rodrigo C. O. [4]
Ramos, Luiz [3]
Castro, Marcio [1]
Goes, Luis F. W. [2]
Affiliations
[1] Univ Fed Santa Catarina, Florianopolis, SC, Brazil
[2] Pontificia Univ Catolica Minas Gerais, Belo Horizonte, MG, Brazil
[3] Univ Estadual Campinas, Campinas, SP, Brazil
[4] Univ Edinburgh, Edinburgh, Midlothian, Scotland
Keywords
Stencil; Work Partitioning; Decision Tree Learning
DOI
10.1109/SBAC-PADW.2017.16
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The stencil pattern is important in many scientific and engineering domains, spurring great interest from researchers and industry. In recent years, various optimizations have been proposed for parallel stencil applications running on GPUs. However, most of the runtime systems that execute those applications often fail to fully utilize the parallelism of modern heterogeneous systems. In this paper, we propose a mechanism based on machine learning that automatically partitions stencil computations across CPU and GPU. We implemented it into the PSkel framework and found that the mechanism can boost the performance of stencil applications on average by 17.9x compared to their sequential CPU-only counterparts, by 1.34x compared to a GPU-only version, and by 1.48x compared to a parallel CPU-only version.
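The row-wise split that the abstract describes can be illustrated with a minimal sketch. This is not PSkel's API and not the authors' mechanism: the function names and the `gpu_fraction` parameter are hypothetical, both partitions run as plain Python on the CPU, and the split ratio is passed in by hand rather than predicted by a learned model. It only shows that a stencil iteration can be divided into two row ranges that read the same input grid and together reproduce the unpartitioned result.

```python
def stencil_rows(grid, row_start, row_end):
    """Apply a 4-neighbour averaging stencil to rows [row_start, row_end).

    Out-of-range neighbours count as 0. Only `grid` is read, so adjacent
    partitions can be computed independently of each other.
    """
    height, width = len(grid), len(grid[0])

    def cell(i, j):
        return grid[i][j] if 0 <= i < height and 0 <= j < width else 0.0

    return [
        [(cell(i - 1, j) + cell(i + 1, j) + cell(i, j - 1) + cell(i, j + 1)) / 4.0
         for j in range(width)]
        for i in range(row_start, row_end)
    ]


def partitioned_step(grid, gpu_fraction):
    """One stencil iteration with rows split between two devices.

    `gpu_fraction` is a hypothetical share of rows assigned to the GPU.
    Here it is a plain argument; choosing it well is the hard part.
    """
    height = len(grid)
    split = max(0, min(height, round(height * gpu_fraction)))
    gpu_part = stencil_rows(grid, 0, split)       # stand-in for a GPU kernel
    cpu_part = stencil_rows(grid, split, height)  # stand-in for CPU threads
    return gpu_part + cpu_part


grid = [[float(i * 5 + j) for j in range(5)] for i in range(6)]
whole = stencil_rows(grid, 0, len(grid))
split_result = partitioned_step(grid, 0.7)
print(split_result == whole)
```

In a real heterogeneous run each partition would also need the boundary (halo) rows of its neighbour copied between host and device memory on every iteration; the sketch sidesteps that because both partitions read the same in-memory grid.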
Pages: 43-48
Number of pages: 6