Beyond Explicit Transfers: Shared and Managed Memory in OpenMP

Cited by: 0
Authors
Neth, Brandon [1 ]
Scogland, Thomas R. W. [2 ]
Duran, Alejandro [3 ]
de Supinski, Bronis R. [2 ]
Affiliations
[1] Univ Arizona, Tucson, AZ 85721 USA
[2] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
[3] Intel Corp, Iberia, Spain
DOI
10.1007/978-3-030-85262-7_13
Chinese Library Classification: TP3 [Computing Technology, Computer Technology]
Discipline Code: 0812
Abstract
OpenMP began supporting offloading in version 4.0, almost 10 years ago. It introduced the programming model for offload to GPUs and other accelerators that was common at the time, requiring users to explicitly transfer data between the host and devices. But advances in heterogeneous computing and programming systems have created a new environment. Programmers are no longer required to track and move their data on their own. Now, for those who want it, inter-device address mapping and other runtime systems push these data management tasks behind a veil of abstraction. In the context of this progress, OpenMP's offloading support shows signs of its age. However, because of its ubiquity as a standard for portable, parallel code, OpenMP is well positioned to provide a similar standard for heterogeneous programming. Towards this goal, we review the features available in other programming systems and argue that OpenMP should expand its offloading support to better meet the expectations of modern programmers. The first step, detailed here, augments OpenMP's existing memory space abstraction with device awareness and a concept of shared and managed memory. Thus, users can allocate memory accessible to different combinations of devices without requiring explicit memory transfers. We show the potential performance impact of this feature and discuss the possible downsides.
Pages: 183-194 (12 pages)
Related Papers (50 records)
  • [21] The Omni OpenMP compiler on the distributed shared memory of Cenju-4
    Kusano, K
    Sato, M
    Hosomi, T
    Seo, Y
    OPENMP SHARED MEMORY PARALLEL PROGRAMMING, PROCEEDINGS, 2001, 2104 : 20 - 30
  • [22] Optimization of OpenMP Offload Shared Memory Access for Domestic Heterogeneous Platforms
    Wang, Xin
    Li, Jianan
    Han, Lin
    Zhao, Rongcai
    Zhou, Qiangwei
    Computer Engineering and Applications, 2023, 59 (10) : 75 - 85
  • [23] Evaluation of SMP Shared Memory Machines for Use With In-Memory and OpenMP Big Data Applications
    Younge, Andrew J.
    Reidy, Christopher
    Henschel, Robert
    Fox, Geoffrey C.
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 1597 - 1606
  • [24] Explicit TCP window adaptation in a shared memory architecture
    Aweya, J
    Ouellette, M
    Montuno, DY
    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2003, 16 (04) : 337 - 356
  • [25] OpenMP: Parallel programming API for shared memory multiprocessors and on-chip multiprocessors
    Sato, M
    ISSS'02: 15TH INTERNATIONAL SYMPOSIUM ON SYSTEM SYNTHESIS, 2002, : 109 - 111
  • [26] Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems
    Chapman, B
    Bregier, F
    Patil, A
    Prabhakar, A
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2002, 14 (8-9): : 713 - 739
  • [27] The Design of MPI Based Distributed Shared Memory Systems to Support OpenMP on Clusters
    Wong, H'sien J.
    Rendell, A. P.
    2007 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, 2007, : 231 - 240
  • [28] UPMLIB: A runtime system for tuning the memory performance of OpenMP programs on scalable shared-memory multiprocessors
    Nikolopoulos, DS
    Papatheodorou, TS
    Polychronopoulos, CD
    Labarta, J
    Ayguadé, E
    LANGUAGES, COMPILERS, AND RUN-TIME SYSTEMS FOR SCALABLE COMPUTERS, 2000, 1915 : 85 - 99
  • [29] Characterization of OpenMP applications on the InfiniBand-based distributed virtual shared memory system
    Park, I
    Kim, SW
    Park, K
    HIGH PERFORMANCE COMPUTING - HIPC 2004, 2004, 3296 : 430 - 439
  • [30] Automatic Performance Analysis of OpenMP Codes on a Scalable Shared Memory System Using Periscope
    Benedict, Shajulin
    Gerndt, Michael
    APPLIED PARALLEL AND SCIENTIFIC COMPUTING, PT II, 2012, 7134 : 452 - 462