MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing

Cited by: 8
Authors
Chen, Dan [1 ]
He, Haiheng [1 ]
Jin, Hai [1 ]
Zheng, Long [1 ]
Huang, Yu [1 ]
Shen, Xinyang [1 ]
Liao, Xiaofei [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Sch Comp Sci & Technol, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heterogeneous graph neural networks; Cartesian product; near-memory processing;
DOI
10.1145/3579371.3589091
CLC number
TP3 [Computing Technology, Computer Technology];
Subject classification code
0812;
Abstract
Heterogeneous graph neural networks (HGNNs) based on metapaths are powerful at capturing the rich structural and semantic information in heterogeneous graphs. HGNNs are highly memory-bound and can therefore be accelerated by near-memory processing. However, they also suffer from a significant memory footprint (from storing metapath instances as intermediate data) and severe redundant computation (when vertex features are aggregated over metapath instances). To address these issues, this paper proposes MetaNMP, the first DIMM-based near-memory processing accelerator for HGNNs, which achieves a reduced memory footprint and high performance. Specifically, we first propose a Cartesian-like product paradigm that generates all metapath instances on the fly for heterogeneous graphs. In this way, metapath instances no longer need to be stored as intermediate data, avoiding significant memory consumption. We then design a dataflow for aggregating vertex features over metapath instances: features are aggregated along the direction in which metapath instances disperse from the starting vertex, which exposes shareable aggregation computations and eliminates most of the redundant computation. Finally, we integrate specialized hardware units into the DIMMs to accelerate HGNNs with near-memory processing, and introduce a broadcast mechanism for edge data and vertex features to mitigate inter-DIMM communication. Our evaluation shows that MetaNMP reduces memory consumption by 51.9% on average and improves performance by 415.18x compared to an NVIDIA Tesla V100 GPU.
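To make the abstract's two core ideas concrete, the following Python sketch illustrates (1) generating metapath instances on the fly via a Cartesian-like product of per-hop neighbor lists, so instances are never materialized as intermediate data, and (2) sharing partial aggregations along the direction in which instances disperse from the starting vertex. This is an illustrative software analogue, not the authors' implementation: the toy graph, the edge-type encoding, the sum aggregator, and the memoization table are assumptions standing in for MetaNMP's DIMM-side hardware dataflow.

import numpy as np

# Toy heterogeneous graph: adjacency lists per edge type (assumed encoding).
# Metapath A-P-A: author -> paper -> author (e.g., co-authorship semantics).
adj = {
    ("A", "P"): {0: [0, 1], 1: [1]},     # author -> papers written
    ("P", "A"): {0: [0, 1], 1: [0, 1]},  # paper  -> authors
}
rng = np.random.default_rng(0)
feat = {"A": rng.random((2, 4)), "P": rng.random((2, 4))}  # 4-dim features

def instances(v, metapath):
    """Enumerate metapath instances rooted at v on the fly: a Cartesian-like
    product over the neighbor lists of each hop, yielded one at a time so the
    instance set is never stored as intermediate data."""
    if len(metapath) == 1:
        yield (v,)
        return
    for u in adj[(metapath[0], metapath[1])][v]:
        for rest in instances(u, metapath[1:]):
            yield (v,) + rest

memo = {}  # (vertex, remaining metapath) -> (summed features, instance count)

def aggregate(v, metapath):
    """Sum vertex features over all instances dispersing from v. Subtree
    results are memoized, so an aggregation shared by many instances (or by
    many starting vertices) is computed only once: the 'shareable aggregation
    computations' the abstract refers to."""
    key = (v, metapath)
    if key in memo:
        return memo[key]
    if len(metapath) == 1:
        result = (feat[metapath[0]][v], 1)
    else:
        total, count = np.zeros_like(feat[metapath[0]][v]), 0
        for u in adj[(metapath[0], metapath[1])][v]:
            sub, n = aggregate(u, metapath[1:])      # reused subtree aggregate
            total += sub + n * feat[metapath[0]][v]  # v counted once per instance
            count += n
        result = (total, count)
    memo[key] = result
    return result

paths = list(instances(0, ("A", "P", "A")))
vec, n = aggregate(0, ("A", "P", "A"))
assert n == len(paths)  # 4 instances: (0,0,0), (0,0,1), (0,1,0), (0,1,1)
print(f"{n} A-P-A instances from author 0; mean-aggregated feature:", vec / n)

In this sketch, the enumerator yields the four A-P-A instances without ever materializing them, and the memo table lets a second starting author reuse the paper-side partial aggregates. In MetaNMP these roles are played, respectively, by on-the-fly instance generation and the dispersion-ordered aggregation dataflow realized in the DIMM-side hardware units.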
Pages: 784-796
Number of pages: 13
Related papers
38 records in total
  • [31] GNNear: Accelerating Full-Batch Training of Graph Neural Networks with Near-Memory Processing
    Zhou, Zhe
    Li, Cong
    Wei, Xuechao
    Wang, Xiaoyang
    Sun, Guangyu
    PROCEEDINGS OF THE 2022 31ST INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PACT 2022, 2022, : 54 - 68
  • [32] G-NMP: Accelerating Graph Neural Networks with DIMM-based Near-Memory Processing
    Tian, Teng
    Wang, Xiaotian
    Zhao, Letian
    Wu, Wei
    Zhang, Xuecang
    Lu, Fangmin
    Wang, Tianqi
    Jin, Xi
    JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 129
  • [33] ABC-DIMM: Alleviating the Bottleneck of Communication in DIMM-based Near-Memory Processing with Inter-DIMM Broadcast
    Sun, Weiyi
    Li, Zhaoshi
    Yin, Shouyi
    Wei, Shaojun
    Liu, Leibo
    2021 ACM/IEEE 48TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2021), 2021, : 237 - 250
  • [34] SADIMM: Accelerating Sparse Attention Using DIMM-Based Near-Memory Processing
    Li, Huize
    Chen, Dan
    Mitra, Tulika
    IEEE TRANSACTIONS ON COMPUTERS, 2025, 74 (02) : 542 - 554
  • [35] TiPU: A Spatial-Locality-Aware Near-Memory Tile Processing Unit for 3D Point Cloud Neural Network
    Zheng, Jiapei
    Jiang, Hao
    Nie, Xinkai
    Huang, Zhangcheng
    Chen, Chixiao
    Liu, Qi
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023
  • [36] Guest Editorial IEEE Transactions on Emerging Topics in Computing Thematic Section on Memory-Centric Designs: Processing-in-Memory, In-Memory Computing, and Near-Memory Computing for Real-World Applications
    Chang, Yuan-Hao
    Piuri, Vincenzo
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2023, 11 (02) : 278 - 280
  • [37] Instant-NeRF: Instant On-Device Neural Radiance Field Training via Algorithm-Accelerator Co-Designed Near-Memory Processing
    Zhao, Yang
    Wu, Shang
    Zhang, Jingqun
    Li, Sixu
    Li, Chaolian
    Lin, Yingyan
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023
  • [38] A 1-16b Reconfigurable 80Kb 7T SRAM-Based Digital Near-Memory Computing Macro for Processing Neural Networks
    Kim, Hyunjoon
    Mu, Junjie
    Yu, Chengshuo
    Kim, Tony Tae-Hyoung
    Kim, Bongjin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (04) : 1580 - 1590