HeNCoG: A Heterogeneous Near-memory Computing Architecture for Energy Efficient GCN Acceleration

Cited by: 0
Authors
Hwang, Seung-Eon [1]
Song, Duyeong [1]
Park, Jongsun [1]
Affiliations
[1] Korea Univ, Sch Elect Engn, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Graph Convolutional Network; Sparse Matrix Multiplication; Near-memory Computing; Domain Specific Accelerator; PERFORMANCE;
DOI
10.1109/ISCAS58744.2024.10558133
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The graph convolutional network (GCN), which first applies convolutional operations to process graph data, has gained attention in various tasks involving relational data. Previous GCN accelerators have been designed either with heterogeneous cores, reflecting the two stages of inference (aggregation and combination), or with a unified core that treats multi-layer inference as an iterative sparse-dense matrix multiplication. However, these prior works suffer from an unnecessarily large number of multiply-accumulate (MAC) operations and/or main memory accesses. In this paper, we propose HeNCoG, a GCN accelerator that uses a heterogeneous MAC array core for the combination stage and a near-memory computing core for the aggregation stage. Because the number of MAC operations is significantly reduced when the stage execution order is changed, HeNCoG executes the combination stage first, using a row-stationary dataflow. In the aggregation stage, magneto-resistive random-access memory (MRAM)-based near-memory computing is employed to reduce the number of main memory accesses needed to read the adjacency matrix of the graph dataset. Graph partitioning and double buffering techniques are also applied to further improve hardware efficiency. Simulation results show that the HeNCoG architecture reduces execution cycles by 97% and memory accesses by 42% compared to previous works.
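The effect of stage order on the MAC count can be illustrated with a simple counting model. The sketch below is not taken from the paper: the function name `mac_counts`, the assumption of a dense feature matrix X, and the example sizes are all hypothetical, chosen only to show why running combination (X·W) before aggregation (A·(XW)) tends to need fewer MACs when the output feature dimension is smaller than the input dimension.

```python
def mac_counts(nnz_adj, num_nodes, in_dim, out_dim):
    """Estimate MACs for one GCN layer A @ X @ W under two execution orders.

    Assumptions (hypothetical, for illustration only):
      - A is a sparse num_nodes x num_nodes adjacency matrix with nnz_adj nonzeros,
        and zero entries of A are skipped.
      - X (num_nodes x in_dim) and W (in_dim x out_dim) are treated as dense.
    Returns (aggregation_first, combination_first).
    """
    # Dense combination cost is the same in both orders: one MAC per
    # (node, input feature, output feature) triple.
    combination = num_nodes * in_dim * out_dim

    # Aggregation first, (A @ X) @ W: each nonzero of A scales a row of X
    # that is in_dim wide.
    aggregation_first = nnz_adj * in_dim + combination

    # Combination first, A @ (X @ W): each nonzero of A scales a row of XW
    # that is only out_dim wide.
    combination_first = combination + nnz_adj * out_dim

    return aggregation_first, combination_first


if __name__ == "__main__":
    # Made-up sizes: 1000 nodes, 5000 edges, 512 input and 16 output features.
    agg_first, comb_first = mac_counts(nnz_adj=5000, num_nodes=1000,
                                       in_dim=512, out_dim=16)
    print(agg_first, comb_first)
```

In this model the two orders differ by nnz_adj × (in_dim − out_dim) MACs, so the saving grows with the number of edges and with how much the layer shrinks the feature dimension. The actual savings reported for HeNCoG depend on its datasets and hardware and are not reproduced here.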
Pages: 5