MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing

Cited by: 8
Authors
Chen, Dan [1 ]
He, Haiheng [1 ]
Jin, Hai [1 ]
Zheng, Long [1 ]
Huang, Yu [1 ]
Shen, Xinyang [1 ]
Liao, Xiaofei [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab,Sch Comp Sci & Technol, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heterogeneous graph neural networks; cartesian product; near-memory processing;
DOI
10.1145/3579371.3589091
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Metapath-based heterogeneous graph neural networks (HGNNs) are powerful at capturing the rich structural and semantic information in heterogeneous graphs. HGNNs are highly memory-bound and can therefore be accelerated by near-memory processing. However, they also suffer from a significant memory footprint (from storing metapath instances as intermediate data) and severe redundant computation (when vertex features are aggregated over metapath instances). To address these issues, this paper proposes MetaNMP, the first DIMM-based near-memory processing accelerator for HGNNs, offering a reduced memory footprint and high performance. Specifically, we first propose a cartesian-like product paradigm that generates all metapath instances on the fly for heterogeneous graphs. Metapath instances therefore no longer need to be stored as intermediate data, avoiding significant memory consumption. We then design a dataflow for aggregating vertex features over metapath instances: features are aggregated along the direction in which metapath instances disperse from the starting vertex, exploiting shareable aggregation computations and eliminating most of the redundant computation. Finally, we integrate specialized hardware units into the DIMM to accelerate HGNNs with near-memory processing, and introduce a broadcast mechanism for edge data and vertex features to mitigate inter-DIMM communication. Our evaluation shows that MetaNMP reduces memory space by 51.9% on average and improves performance by 415.18x compared to an NVIDIA Tesla V100 GPU.
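The abstract's core idea, generating metapath instances on the fly rather than materializing them, can be illustrated with a small sketch. This is not the paper's hardware mechanism: the toy graph, vertex names, the Author-Paper-Author metapath, and the `instances` generator below are all invented for illustration.

```python
# Illustrative sketch: lazily enumerate metapath instances with a generator,
# instead of storing them all as intermediate data. The graph and metapath
# here are hypothetical; MetaNMP itself does this in near-memory hardware.

# Adjacency lists of a toy heterogeneous graph, keyed by
# (source vertex type, destination vertex type).
adj = {
    ("A", "P"): {"a0": ["p0", "p1"]},            # Author -> Paper edges
    ("P", "A"): {"p0": ["a1", "a2"], "p1": ["a1"]},  # Paper -> Author edges
}

def instances(metapath, v, path=()):
    """Lazily yield every instance of `metapath` rooted at vertex `v`.

    Instances that share a prefix (e.g. a0 -> p0) are expanded from that
    common prefix as the traversal disperses from the starting vertex, so
    an aggregation driven by this order can reuse partial results instead
    of recomputing them once per instance.
    """
    path = path + (v,)
    if len(path) == len(metapath):
        yield path
        return
    src_type, dst_type = metapath[len(path) - 1], metapath[len(path)]
    for u in adj.get((src_type, dst_type), {}).get(v, []):
        yield from instances(metapath, u, path)

for inst in instances(("A", "P", "A"), "a0"):
    print(inst)  # three A-P-A instances rooted at a0, never stored as a list
```

Running the loop visits `('a0', 'p0', 'a1')`, `('a0', 'p0', 'a2')`, and `('a0', 'p1', 'a1')` one at a time; the full set of instances is never held in memory, which is the footprint saving the abstract attributes to on-the-fly generation.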
Pages: 784-796
Page count: 13