Data Partitioning Strategies for Stencil Computations on NUMA Systems

Cited by: 0

Authors:

Feinbube, Frank [1]

Plauth, Max [1]

Knaust, Marius [1]

Polze, Andreas [1]

Affiliations:

[1] Univ Potsdam, Hasso Plattner Inst Software Syst Engn, Operating Syst & Middleware Grp, Potsdam, Germany

Source:

EURO-PAR 2017: PARALLEL PROCESSING WORKSHOPS | 2018 / Vol. 10659

Keywords:

NUMA; Stencil computation; Data partitioning; OPTIMIZATION;

DOI:

10.1007/978-3-319-75178-8_48

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Many scientific problems rely on the efficient execution of stencil computations, which are usually memory-bound. In this paper, stencils on two-dimensional data are executed on NUMA architectures. Each node of a NUMA system processes a distinct partition of the input data independently of the other nodes. However, processors may need access to the memory of other nodes at the edges of the partitions. This paper demonstrates two techniques based on machine learning for identifying partitioning strategies that reduce the occurrence of remote memory access. One approach is generally applicable and is based on an uninformed search. The second approach caps the search space by employing geometric decomposition. The partitioning strategies obtained with these techniques are analyzed theoretically. Finally, an evaluation on a real NUMA machine is conducted, which demonstrates that the expected reduction of the remote memory accesses can be achieved.
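
To illustrate why the shape of a partition matters, here is a minimal Python sketch that counts how many neighbor reads of a 5-point stencil cross a partition boundary. It is not the authors' method: the helper names (`remote_accesses`, `strip_partition`, `block_partition`), the 8 x 8 grid, and the four-node setup are all assumptions made for illustration, and every read of a cell owned by another node is counted as one remote access, with no caching or deduplication.

```python
# Illustrative sketch only: counts boundary-crossing stencil reads for two
# hand-written partitionings. All names here are hypothetical helpers.

FIVE_POINT = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # neighbor offsets (row, col)


def remote_accesses(owner, offsets):
    """Count neighbor reads whose target cell belongs to a different node.

    owner[r][c] is the id of the NUMA node that owns cell (r, c).
    Reads that fall outside the grid are skipped.
    """
    rows, cols = len(owner), len(owner[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and owner[nr][nc] != owner[r][c]:
                    count += 1
    return count


def strip_partition(n, nodes):
    """Split an n x n grid into `nodes` horizontal strips of rows."""
    return [[r * nodes // n for _ in range(n)] for r in range(n)]


def block_partition(n, node_rows, node_cols):
    """Split an n x n grid into node_rows x node_cols rectangular blocks."""
    return [
        [(r * node_rows // n) * node_cols + (c * node_cols // n) for c in range(n)]
        for r in range(n)
    ]


strips = remote_accesses(strip_partition(8, 4), FIVE_POINT)
blocks = remote_accesses(block_partition(8, 2, 2), FIVE_POINT)
print(strips, blocks)
```

In this toy setup the four strips have three internal boundaries of length 8, while the 2 x 2 blocks have only two, so the block layout produces fewer boundary-crossing reads. Real NUMA behavior also depends on page placement, cache effects, and the stencil shape, none of which this sketch models.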

Pages: 597-609

Page count: 13

50 items in total

[21] Tiling Stencil Computations to Maximize Parallelism
Bandishti, Vinayaka
Pananilath, Irshad
Bondhugula, Uday
[J]. 2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012
[22] Resilience for Stencil Computations with Latent Errors
Fang, Aiman
Cavelan, Aurelien
Robert, Yves
Chien, Andrew A.
[J]. 2017 46TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2017, : 581 - 590
[23] Register Caching for Stencil Computations on GPUs
Falch, Thomas L.
Elster, Anne C.
[J]. 16TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2014), 2014, : 479 - 486
[24] OpenMP task scheduling strategies for multicore NUMA systems
Olivier, Stephen L.
Porterfield, Allan K.
Wheeler, Kyle B.
Spiegel, Michael
Prins, Jan F.
[J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2012, 26 (02) : 110 - 124
[25] Automatic Adaptive Approximation for Stencil Computations
Schmitt, Maxime
Helluy, Philippe
Bastoul, Cedric
[J]. PROCEEDINGS OF THE 28TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION (CC '19), 2019, : 170 - 181
[26] Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures
Henretty, Tom
Stock, Kevin
Pouchet, Louis-Noel
Franchetti, Franz
Ramanujam, J.
Sadayappan, P.
[J]. COMPILER CONSTRUCTION, 2011, 6601 : 225 - +
[27] RECTILINEAR PARTITIONING OF IRREGULAR DATA-PARALLEL COMPUTATIONS
NICOL, DM
[J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1994, 23 (02) : 119 - 134
[28] NUMA-BTDM: A thread mapping algorithm for balanced data locality on NUMA systems
Stirb, Iulia
[J]. 2016 17TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES (PDCAT), 2016, : 317 - 320
[29] Parallel Data-Locality Aware Stencil Computations on Modern Micro-Architectures
Christen, Matthias
Schenk, Olaf
Neufeld, Esra
Messmer, Peter
Burkhart, Helmar
[J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 547 - +
[30] Double precision stencil computations on Kepler GPUs
Vizitiu, Anamaria
Itu, Lucian
Lazar, Laszlo
Suciu, Constantin
[J]. 2014 18TH INTERNATIONAL CONFERENCE SYSTEM THEORY, CONTROL AND COMPUTING (ICSTCC), 2014, : 123 - 127
