MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing

Cited by: 8
Authors
Chen, Dan [1]
He, Haiheng [1]
Jin, Hai [1]
Zheng, Long [1]
Huang, Yu [1]
Shen, Xinyang [1]
Liao, Xiaofei [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Sch Comp Sci & Technol, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Heterogeneous graph neural networks; Cartesian product; near-memory processing
DOI
10.1145/3579371.3589091
Chinese Library Classification (CLC)
TP3 [Computing Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Heterogeneous graph neural networks (HGNNs) based on metapaths are powerful at capturing the rich structural and semantic information in heterogeneous graphs. HGNNs are highly memory-bound and can therefore be accelerated by near-memory processing. However, they also suffer from a significant memory footprint (because metapath instances are stored as intermediate data) and severe redundant computation (when vertex features are aggregated over metapath instances). To address these issues, this paper proposes MetaNMP, the first DIMM-based near-memory processing accelerator for HGNNs, which reduces the memory footprint while delivering high performance. Specifically, we first propose a Cartesian-like product paradigm that generates all metapath instances on the fly for heterogeneous graphs. Metapath instances therefore no longer need to be stored as intermediate data, avoiding significant memory consumption. We then design a data flow for aggregating vertex features over metapath instances, which aggregates vertex features along the direction in which metapath instances disperse from the starting vertex, so that shareable aggregation computations are exploited and most redundant computation is eliminated. Finally, we integrate specialized hardware units into the DIMM to accelerate HGNNs with near-memory processing, and introduce a broadcast mechanism for edge data and vertex features to mitigate inter-DIMM communication. Our evaluation shows that MetaNMP reduces memory usage by 51.9% on average and improves performance by 415.18x compared to an NVIDIA Tesla V100 GPU.
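The abstract's two key ideas, generating metapath instances on the fly via a Cartesian-like product and sharing aggregation work along the direction in which instances disperse from the starting vertex, can be illustrated in software. The Python sketch below is an illustrative assumption only: the toy graph, the mean aggregation, and names such as adj, feat, and metapath are invented for the example, and it does not reflect MetaNMP's actual hardware data flow.

from collections import defaultdict
import numpy as np

# Toy heterogeneous graph (invented for illustration): adjacency lists
# keyed by (source vertex type, destination vertex type).
adj = {
    ("A", "P"): {0: [0, 1], 1: [1]},   # author -> papers written
    ("P", "A"): {0: [0], 1: [0, 1]},   # paper  -> authors
}
feat = {"A": np.random.rand(2, 4), "P": np.random.rand(2, 4)}  # 4-dim features
metapath = ["A", "P", "A"]             # e.g., author-paper-author

def aggregate(start_vertex):
    # Aggregate features over all metapath instances rooted at start_vertex.
    # Instances are enumerated on the fly (a Cartesian-like expansion of the
    # per-hop neighbor lists), so they are never materialized, and partial
    # sums for shared path prefixes are reused instead of recomputed.
    #
    # frontier maps a reached vertex to (sum of prefix feature sums over all
    # prefix instances ending at that vertex, number of such prefix instances).
    frontier = {start_vertex: (feat[metapath[0]][start_vertex].copy(), 1)}
    for src_t, dst_t in zip(metapath, metapath[1:]):
        nxt = defaultdict(lambda: (np.zeros(4), 0))
        for v, (prefix_sum, count) in frontier.items():
            for u in adj[(src_t, dst_t)].get(v, []):
                # Shareable computation: prefix_sum is added once per edge,
                # not once per downstream metapath instance.
                s, c = nxt[u]
                nxt[u] = (s + prefix_sum + count * feat[dst_t][u], c + count)
        frontier = dict(nxt)
    total = sum(s for s, _ in frontier.values())
    n = sum(c for _, c in frontier.values())
    return total / max(n, 1)           # mean over all metapath instances

print(aggregate(0))                    # aggregation for author vertex 0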
Pages: 784-796 (13 pages)