Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads

Authors
Albert Segura
Jose Maria Arnau
Antonio Gonzalez
Affiliation
[1] Universitat Politècnica de Catalunya (UPC), Departament d’Arquitectura de Computadors
Keywords
GPGPU; Graph processing; Parallel architectures; Computer architecture
DOI
Not available
Abstract
GPGPU architectures have become the dominant platform for massively parallel workloads, delivering high performance and energy efficiency for popular applications such as machine learning, computer vision, and self-driving cars. However, irregular applications, such as graph processing, fail to fully exploit GPGPU resources because their divergent memory accesses saturate the memory hierarchy. To reduce the pressure on the memory subsystem for divergent memory-intensive applications, programmers must take into account the SIMT execution model and memory coalescing in GPGPUs, devoting significant effort to complex optimization techniques. Despite these efforts, we show that irregular graph processing still suffers from low GPGPU performance. We observe that in many irregular applications the mapping of data to threads can be safely changed. In other words, it is possible to relax the strict relationship between a thread and the data it processes in order to reduce memory divergence. Based on this observation, we propose the Irregular accesses Reorder Unit (IRU), a novel hardware extension tightly integrated in the GPGPU pipeline. The IRU reorders the data processed by the threads on irregular accesses to improve memory coalescing, i.e., it tries to assign data elements to threads so as to produce coalesced accesses in SIMT groups. Furthermore, the IRU is capable of filtering and merging duplicated accesses, significantly reducing the workload. Programmers can easily utilize the IRU with a simple API, or let the compiler issue instructions from our extended ISA. We evaluate our proposal for state-of-the-art graph-based algorithms and a wide selection of applications. Results show that the IRU achieves a memory coalescing improvement of 1.32x and a 46% reduction in the overall traffic in the memory hierarchy, which results in a 1.33x speedup and 13% energy savings on average, while incurring a small 5.6% area overhead.
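The effect of reordering and merging on coalescing can be illustrated with a small software model. The Python sketch below is not the IRU's hardware mechanism, which the abstract does not describe; it assumes a 32-thread SIMT group, 128-byte memory lines, and 4-byte elements, and it uses sorting plus duplicate removal as a stand-in for the reorder and merge steps. The names `transactions` and `reorder_and_merge` and the access pattern are hypothetical.

```python
from typing import List

WARP_SIZE = 32    # threads per SIMT group (assumed)
LINE_BYTES = 128  # bytes served by one memory transaction (assumed)
ELEM_BYTES = 4    # bytes per data element (assumed)


def transactions(indices: List[int]) -> int:
    """Count the distinct memory lines touched by each SIMT group, summed."""
    total = 0
    for start in range(0, len(indices), WARP_SIZE):
        group = indices[start:start + WARP_SIZE]
        lines = {(i * ELEM_BYTES) // LINE_BYTES for i in group}
        total += len(lines)
    return total


def reorder_and_merge(indices: List[int]) -> List[int]:
    """Stand-in for reordering: drop duplicates, then sort so that
    consecutive threads receive elements from the same memory line."""
    return sorted(set(indices))


# Hypothetical irregular stream for 128 threads: neighbouring threads hit
# four different lines, and the second half of the threads repeats the
# indices of the first half.
irregular = [(t % 4) * 32 + ((t // 4) % 16) for t in range(128)]

before = transactions(irregular)
merged = reorder_and_merge(irregular)
after = transactions(merged)

print(before, after, len(merged))  # prints "16 4 64"
```

In this toy pattern every group of 32 threads originally touches four lines, so 128 accesses need 16 transactions. After duplicates are merged and the remaining 64 elements are reassigned in sorted order, each group touches two lines and four transactions suffice. The transformation is only valid under the condition stated in the abstract, namely that the mapping of data to threads can be safely changed.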
Pages: 762-787
Page count: 25