HeNCoG: A Heterogeneous Near-memory Computing Architecture for Energy Efficient GCN Acceleration

Cited by: 0

Authors:

Hwang, Seung-Eon [1]

Song, Duyeong [1]

Park, Jongsun [1]

Affiliations:

[1] Korea Univ, Sch Elect Engn, Seoul, South Korea

Source:

2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024

Funding:

National Research Foundation, Singapore;

Keywords:

Graph Convolutional Network; Sparse Matrix Multiplication; Near-memory Computing; Domain Specific Accelerator; PERFORMANCE;

DOI:

10.1109/ISCAS58744.2024.10558133

Chinese Library Classification number:

TP39 [Computer Applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

The graph convolutional network (GCN), which first applied convolutional operations to graph data, has gained attention in various tasks involving relational data. Previous GCN accelerators have been designed either with heterogeneous cores, reflecting the two stages of inference (aggregation and combination), or with a unified core that treats multi-layer inference as an iterative sparse-dense matrix multiplication. However, those prior works suffer from an unnecessarily large number of multiply-accumulate (MAC) operations and/or main memory accesses. In this paper, we propose HeNCoG, a GCN accelerator that utilizes a heterogeneous MAC array core for the combination stage and a near-memory computing core for the aggregation stage. In HeNCoG, because the number of MAC operations is significantly reduced when the stage execution order is changed, the combination stage is executed first with a row-stationary dataflow. In the aggregation stage, magneto-resistive random-access memory (MRAM)-based near-memory computing is employed to reduce the number of main memory accesses needed to read the adjacency matrix of the graph dataset. Graph partitioning and double buffering techniques are also applied to further improve hardware efficiency. Simulation results show that the HeNCoG architecture reduces execution cycles by 97% and memory accesses by 42% compared to previous works.
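
The abstract's point about stage execution order can be illustrated with a rough MAC count for one layer computed as A·X·W. The sketch below is not the paper's cost model: it assumes a dense feature matrix X, a dense weight matrix W, and a sparse adjacency matrix A, and the function name and all sizes are hypothetical.

```python
def gcn_layer_macs(num_nodes, nnz_adj, f_in, f_out):
    """Return (aggregation_first, combination_first) MAC counts for A @ X @ W.

    Assumptions (not taken from the paper): X is dense with shape
    (num_nodes, f_in), W is dense with shape (f_in, f_out), and A is
    sparse with nnz_adj nonzero entries.
    """
    # (A @ X) @ W: sparse-dense product over f_in columns, then a dense product.
    aggregation_first = nnz_adj * f_in + num_nodes * f_in * f_out
    # A @ (X @ W): dense product first, then sparse-dense over f_out columns.
    combination_first = num_nodes * f_in * f_out + nnz_adj * f_out
    return aggregation_first, combination_first


# Hypothetical sizes chosen only for illustration.
agg_first, comb_first = gcn_layer_macs(num_nodes=1000, nnz_adj=5000, f_in=512, f_out=16)
print(agg_first, comb_first)  # 10752000 8272000
```

Under these assumptions the two orders differ by nnz(A) × (f_in − f_out) MACs, so running combination first helps whenever a layer shrinks the feature dimension. Sparse input features, which this sketch ignores, would change the size of the gap.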


Pages: 5
