Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads

Cited by: 0

Authors:

Albert Segura

Jose Maria Arnau

Antonio Gonzalez

Affiliation:

[1] Universitat Politècnica de Catalunya (UPC), Departament d’Arquitectura de Computadors

Source:

The Journal of Supercomputing | 2023 / Vol. 79

Keywords:

GPGPU; Graph processing; Parallel architectures; Computer architecture

DOI:

Not available

Abstract:

GPGPU architectures have become the dominant platform for massively parallel workloads, delivering high performance and energy efficiency for popular applications such as machine learning, computer vision and self-driving cars. However, irregular applications, such as graph processing, fail to fully exploit GPGPU resources because their divergent memory accesses saturate the memory hierarchy. To reduce the pressure on the memory subsystem for divergent memory-intensive applications, programmers must take into account the SIMT execution model and memory coalescing in GPGPUs, devoting significant effort to complex optimization techniques. Despite these efforts, we show that irregular graph processing still suffers from low GPGPU performance. We observe that in many irregular applications the mapping of data to threads can be safely changed. In other words, it is possible to relax the strict relationship between a thread and the data it processes in order to reduce memory divergence. Based on this observation, we propose the Irregular accesses Reorder Unit (IRU), a novel hardware extension tightly integrated into the GPGPU pipeline. The IRU reorders the data processed by the threads on irregular accesses to improve memory coalescing, i.e., it tries to assign data elements to threads so as to produce coalesced accesses in SIMT groups. Furthermore, the IRU is capable of filtering and merging duplicate accesses, significantly reducing the workload. Programmers can easily use the IRU through a simple API, or let the compiler issue instructions from our extended ISA. We evaluate our proposal on state-of-the-art graph-based algorithms and a wide selection of applications. Results show that the IRU achieves a memory coalescing improvement of 1.32x and a 46% reduction in the overall traffic in the memory hierarchy, which results in a 1.33x speedup and 13% energy savings on average, while incurring a small 5.6% area overhead.
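The reordering idea in the abstract can be illustrated with a small software model. The sketch below is not the IRU hardware design, which the abstract does not detail; it only counts how many distinct cache lines each SIMT group touches before and after the data indices are deduplicated and sorted. The warp size, cache-line size, element size, the sort-based reordering, and all function names are assumptions made for illustration. Such a reordering is only valid when, as the abstract notes, the mapping of data to threads can be safely changed.

```python
from typing import List

# Illustrative parameters (assumptions, not taken from the paper).
WARP_SIZE = 32      # threads per SIMT group
LINE_BYTES = 128    # bytes per cache line
ELEM_BYTES = 4      # bytes per data element


def count_transactions(indices: List[int]) -> int:
    """Count cache-line requests when each consecutive group of
    WARP_SIZE indices is accessed by one warp."""
    elems_per_line = LINE_BYTES // ELEM_BYTES
    total = 0
    for start in range(0, len(indices), WARP_SIZE):
        warp = indices[start:start + WARP_SIZE]
        # One request per distinct cache line touched by this warp.
        total += len({i // elems_per_line for i in warp})
    return total


def reorder_and_merge(indices: List[int]) -> List[int]:
    """Hypothetical stand-in for reordering: merge duplicate indices and
    sort the rest so that neighbouring threads get neighbouring data."""
    return sorted(set(indices))


def interleaved_indices() -> List[int]:
    """64 distinct indices in which every warp alternates between two
    cache lines (even positions in one line, odd positions in the next)."""
    return [(k % 2) * 32 + k // 2 for k in range(64)]


if __name__ == "__main__":
    idx = interleaved_indices()
    print("requests before reordering:", count_transactions(idx))
    print("requests after reordering: ", count_transactions(reorder_and_merge(idx)))
```

In this toy input each of the two warps originally touches two cache lines; after sorting, each warp touches one. Merging duplicates additionally shrinks the number of elements to process, which loosely mirrors the filtering and merging the abstract attributes to the IRU.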

Citation

Pages: 762-787

Page count: 25
