HeNCoG: A Heterogeneous Near-memory Computing Architecture for Energy Efficient GCN Acceleration

Cited by: 0
Authors
Hwang, Seung-Eon [1]
Song, Duyeong [1]
Park, Jongsun [1]
Affiliations
[1] Korea Univ, Sch Elect Engn, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Graph Convolutional Network; Sparse Matrix Multiplication; Near-memory Computing; Domain Specific Accelerator; PERFORMANCE;
DOI
10.1109/ISCAS58744.2024.10558133
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The graph convolutional network (GCN), which first applied convolutional operations to graph data, has gained attention in various tasks involving relational data. Previous GCN accelerators have been designed either with heterogeneous cores, reflecting the two stages of inference (aggregation and combination), or with a unified core that treats multi-layer inference as an iterative sparse-dense matrix multiplication. However, those prior works suffer from an unnecessarily large number of multiply-accumulate (MAC) operations and/or main memory accesses. In this paper, we propose HeNCoG, a GCN accelerator that utilizes a heterogeneous MAC array core for the combination stage and a near-memory computing core for the aggregation stage. Because the number of MAC operations is significantly reduced when the stage execution order is changed, HeNCoG executes the combination stage first, using a row-stationary dataflow. In the aggregation stage, magneto-resistive random-access memory (MRAM)-based near-memory computing is employed to reduce the number of main memory accesses needed to read the adjacency matrix of the graph dataset. Graph partitioning and double buffering are also applied to further improve hardware efficiency. Simulation results show that the HeNCoG architecture reduces execution cycles by 97% and memory accesses by 42% compared to previous works.
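The abstract's point about stage order can be illustrated with a rough MAC count for one GCN layer computed as A·X·W. The following is a minimal sketch, not the authors' cost model: it assumes a sparse N×N adjacency matrix A with `nnz_adj` nonzeros, a dense N×F feature matrix X, and a dense F×C weight matrix W. The function name `mac_counts` and the graph sizes are hypothetical.

```python
def mac_counts(num_nodes, nnz_adj, in_feats, out_feats):
    """Return (aggregation_first, combination_first) MAC counts for A @ X @ W.

    Assumption: X and W are dense, A is sparse with nnz_adj nonzeros.
    """
    # Dense N x F by F x C product; same cost whether its input is X or (A @ X).
    combination = num_nodes * in_feats * out_feats
    # Aggregation first: (A @ X) costs nnz_adj * F, then the dense product.
    agg_first = nnz_adj * in_feats + combination
    # Combination first: (X @ W) is the dense product, then A @ (XW) costs nnz_adj * C.
    comb_first = combination + nnz_adj * out_feats
    return agg_first, comb_first


# Hypothetical sizes chosen only for illustration.
agg_first, comb_first = mac_counts(num_nodes=1000, nnz_adj=5000,
                                   in_feats=512, out_feats=16)
print(agg_first, comb_first)
```

Under these assumptions the saving comes entirely from the aggregation term, which scales with the output width C instead of the input width F, so running combination first helps whenever C is smaller than F.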
Pages: 5