Software/Hardware Co-design of 3D NoC-based GPU Architectures for Accelerated Graph Computations

Cited by: 4
Authors
Choudhury, Dwaipayan [1]
Barik, Reet [1]
Rajam, Aravind Sukumaran [1]
Kalyanaraman, Ananth [1]
Pande, Partha Pratim [1]
Affiliations
[1] Washington State Univ, Sch Elect Engn & Comp Sci, POB 642752, Pullman, WA 99164 USA
Funding
National Science Foundation (USA);
Keywords
Software/Hardware Co-design; vertex reordering; graph analytics; GPU manycore; small world NoC; SPACE EXPLORATION; PERFORMANCE; OPTIMIZATION; POWER;
DOI
10.1145/3514354
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Manycore GPU architectures have become the mainstay for accelerating graph computations. One of the primary performance bottlenecks for graph computations on manycore architectures is data movement. Since most of the accesses in graph processing are due to vertex neighborhood lookups, locality in graph data structures plays a key role in dictating the degree of data movement. Vertex reordering is a widely used technique to improve data locality within graph data structures. However, reordering schemes alone are not sufficient: they need to be complemented with efficient task allocation on manycore GPU architectures to reduce latency due to local cache misses. Consequently, in this article, we introduce a software/hardware co-design framework for accelerating graph computations. Our approach couples an architecture-aware vertex reordering with a priority-based task allocation technique. As the task allocation aims to reduce on-chip latency and the associated energy, the choice of Network-on-Chip (NoC) as the communication backbone in the manycore platform is an important parameter. By leveraging emerging three-dimensional (3D) integration technology, we propose the design of a small-world NoC (SWNoC)-enabled manycore GPU architecture, where the placement of the links connecting the streaming multiprocessors (SMs) and the memory controllers (MCs) follows a power-law distribution. The proposed 3D SWNoC-enabled software/hardware co-design framework achieves 11.1% to 22.9% performance improvement and 16.4% to 32.6% less energy consumption, depending on the dataset and the graph application, when compared to the default order of the dataset running on a conventional planar mesh architecture.
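As a rough illustration of why vertex reordering can improve locality, the sketch below relabels vertices in breadth-first order and compares a crude locality proxy (the mean label distance across edges) before and after. This is not the architecture-aware reordering proposed in the article; BFS ordering is a generic stand-in, and the helper names (`build_adj`, `bfs_order`, `relabel`, `avg_gap`) and the toy graph are assumptions made only for this example.

```python
from collections import deque

def build_adj(n, edges):
    """Build an undirected adjacency list for vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def bfs_order(adj, start=0):
    """Map each old vertex id to a new id in BFS visiting order.

    Generic BFS relabeling, used here only as a stand-in for a
    locality-improving reordering (hypothetical helper).
    """
    new_id = {}
    roots = [start] + [v for v in range(len(adj)) if v != start]
    for root in roots:  # also covers disconnected components
        if root in new_id:
            continue
        new_id[root] = len(new_id)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in new_id:
                    new_id[v] = len(new_id)
                    queue.append(v)
    return new_id

def relabel(adj, new_id):
    """Return the adjacency list expressed in the new vertex ids."""
    out = [[] for _ in adj]
    for u, nbrs in enumerate(adj):
        out[new_id[u]] = sorted(new_id[v] for v in nbrs)
    return out

def avg_gap(adj):
    """Mean |u - v| over all adjacency entries; smaller suggests better locality."""
    gaps = [abs(u - v) for u, nbrs in enumerate(adj) for v in nbrs]
    return sum(gaps) / len(gaps) if gaps else 0.0

# Toy input: a path graph whose labels are deliberately scrambled.
EDGES = [(0, 4), (4, 1), (1, 5), (5, 2), (2, 6), (6, 3), (3, 7)]
adj = build_adj(8, EDGES)
reordered = relabel(adj, bfs_order(adj))

print("average gap before:", avg_gap(adj))
print("average gap after: ", avg_gap(reordered))
```

On this toy path graph, neighbors end up with adjacent labels after relabeling, so neighborhood lookups touch nearby entries of any array indexed by vertex id. Real graphs and GPU memory hierarchies are far less regular, which is part of why the abstract argues that reordering must be paired with task allocation.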
Pages: 22