Optimizing non-coalesced memory access for irregular applications with GPU computing

Authors
Ran Zheng
Yuan-dong Liu
Hai Jin
Affiliations
[1] Huazhong University of Science and Technology, National Engineering Research Center for Big Data Technology and System
[2] Huazhong University of Science and Technology, Services Computing Technology and System Lab
[3] Huazhong University of Science and Technology, Cluster and Grid Computing Lab
[4] Huazhong University of Science and Technology, School of Computer Science and Technology
Keywords
General purpose graphics processing units; Memory coalescing; Non-coalesced memory access; Data reordering
DOI
Not available
CLC number
TP319
Abstract
General purpose graphics processing units (GPGPUs) can considerably improve computing performance for regular applications. However, irregular memory access occurs in many applications, and the benefits of graphics processing units (GPUs) are less substantial for irregular applications. In recent years, several studies have presented solutions to remove static irregular memory access, but eliminating dynamic irregular memory access in software remains a serious challenge. This paper proposes a pure software solution, requiring no hardware extensions or offline profiling, to eliminate dynamic irregular memory access, especially indirect memory access. Data reordering and index redirection are used to reduce the number of memory transactions, thereby improving the performance of GPU kernels. To make data reordering more efficient, the reordering operation is offloaded to the GPU, which reduces its overhead and the associated data transfer. Concurrently executing the compute unified device architecture (CUDA) streams of the data reordering and data processing kernels further reduces the reordering overhead. With these optimizations, the volume of memory transactions is reduced by 16.7%–50% compared with CUSPARSE-based benchmarks, and the performance of irregular kernels improves by 9.64%–34.9% on an NVIDIA Tesla P4 GPU.
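The idea of data reordering with index redirection can be illustrated with a small host-side model. The sketch below is not the authors' implementation: it is a minimal Python simulation under stated assumptions (32 threads per warp, 32 four-byte elements per 128-byte memory segment, and a simple "first-touch" reordering policy chosen here for illustration). The names `count_transactions` and `reorder_first_touch` are hypothetical. It counts how many distinct segments each warp touches for an indirect access `x[idx[i]]`, then reorders `x` and redirects `idx` so that each warp reads contiguous data.

```python
WARP_SIZE = 32          # threads per warp
ELEMS_PER_SEGMENT = 32  # assumption: 128-byte segment of 4-byte elements


def count_transactions(idx):
    """Sum, over all warps, the number of distinct segments a warp touches."""
    total = 0
    for start in range(0, len(idx), WARP_SIZE):
        warp = idx[start:start + WARP_SIZE]
        total += len({j // ELEMS_PER_SEGMENT for j in warp})
    return total


def reorder_first_touch(x, idx):
    """Lay data out in the order it is first accessed and redirect indices.

    Returns (x_new, idx_new) such that x_new[idx_new[i]] == x[idx[i]].
    """
    new_pos = {}
    for j in idx:                      # positions for referenced elements
        if j not in new_pos:
            new_pos[j] = len(new_pos)
    for j in range(len(x)):            # unreferenced elements go at the end
        if j not in new_pos:
            new_pos[j] = len(new_pos)
    x_new = [None] * len(x)
    for old, new in new_pos.items():   # data reordering
        x_new[new] = x[old]
    idx_new = [new_pos[j] for j in idx]  # index redirection
    return x_new, idx_new


# Illustrative input: a strided (non-coalesced) indirect access pattern.
n = 1024
x = [float(i) for i in range(n)]
idx = [(i * 33) % n for i in range(n)]

x_new, idx_new = reorder_first_touch(x, idx)
before = count_transactions(idx)
after = count_transactions(idx_new)
print("transactions before:", before, "after:", after)
```

In this toy model the gathered values are unchanged while each warp touches fewer segments after reordering. On a real GPU the reordering itself has a cost, which is why the abstract describes offloading it to the GPU and overlapping it with the processing kernel.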
Pages: 1285–1301
Page count: 16