A Fine-grained Asynchronous Bulk Synchronous parallelism model for PGAS applications

Cited by: 3

Authors:

Paul, Sri Raj ^{[1]}

Hayashi, Akihiro ^{[2]}

Chen, Kun ^{[6]}

Elmougy, Youssef ^{[3]}

Sarkar, Vivek ^{[4,5]}

Affiliations:

[1] Intel Corp, Austin, TX 78746 USA

[2] Georgia Inst Technol, Atlanta, GA USA

[3] Georgia Inst Technol, Comp Sci, Atlanta, GA USA

[4] Georgia Inst Technol, Coll Comp, Sch Comp Sci, Atlanta, GA USA

[5] Georgia Inst Technol, Coll Comp, Telecommun, Atlanta, GA USA

[6] Meta, Menlo Pk, CA USA

Source:

JOURNAL OF COMPUTATIONAL SCIENCE | 2023 / Vol. 69

Keywords:

Actors; Communication aggregation; Conveyors; Bale; PGAS; OpenSHMEM; Selectors; Irregular applications; Large scale graph analytics;

DOI:

10.1016/j.jocs.2023.102014

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

The Partitioned Global Address Space (PGAS) model is well suited for executing irregular applications on cluster-based systems, due to its efficient support for short, one-sided messages. Separately, the actor model has been gaining popularity as a productive asynchronous message-passing approach for distributed objects in enterprise and cloud computing platforms, typically implemented in languages such as Erlang, Scala or Rust. To the best of our knowledge, there has been no past work on using the actor model to deliver both productivity and scalability to irregular PGAS applications with a large number of small messages. In this paper, we introduce a new programming system for PGAS applications, in which point-to-point remote operations can be expressed as fine-grained asynchronous actor messages. In our approach, the programmer does not need to worry about programming complexities related to message aggregation and termination detection. Our approach can be viewed as extending the classical Bulk Synchronous Parallelism model with fine-grained asynchronous communications within a phase or superstep. We believe that our approach offers a desirable point in the productivity-performance space for PGAS applications, with more scalable performance and higher productivity relative to past approaches. Specifically, for seven irregular mini-applications from the Bale Kernels and three graph kernels executed using 2048 cores in the NERSC Cori system, our approach shows geometric mean performance improvements of >= symbolscript relative to standard PGAS versions (UPC and OpenSHMEM) while maintaining comparable productivity to those versions. This is an extended version of the conference paper "A Productive and Scalable Actor-Based Programming System for PGAS Applications" (Paul et al., 2022)[1] from ICCS 2022.


Pages: 13
