Large-Scale Graph Processing on FPGAs with Caches for Thousands of Simultaneous Misses

Cited by: 14
Authors
Asiatici, Mikhail [1]
Ienne, Paolo [1]
Affiliations
[1] Ecole Polytech Fed Lausanne EPFL, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland
Keywords
graph; MOMS; nonblocking cache; DRAM; FPGA; GPU
DOI
10.1109/ISCA52012.2021.00054
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Efficient large-scale graph processing is crucial to many disciplines. Yet, while graph algorithms naturally expose massive parallelism opportunities, their performance is limited by the memory system because of irregular memory accesses. State-of-the-art FPGA graph processors, such as ForeGraph and FabGraph, address the memory issues by using scratchpads and regularly streaming edges from DRAM, but then they end up wasting bandwidth on unneeded data. Yet, where classic caches and scratchpads fail to deliver, FPGAs make powerful unorthodox solutions possible. In this paper, we resort to extreme nonblocking caches that handle tens of thousands of outstanding read misses. They significantly increase the ability of memory systems to coalesce multiple accelerator accesses into fewer DRAM memory requests; essentially, when latency is not the primary concern, they bring the advantages expected from a very large cache at a fraction of the cost. We prove our point with an adaptable graph accelerator running on Amazon AWS f1; our implementation takes into account all practical aspects of such a design, including the challenges involved when working with modern multidie FPGAs. Running classic algorithms (PageRank, SCC, and SSSP) on large graphs, we achieve 3x geometric mean speedup compared to state-of-the-art FPGA accelerators, 1.1-5.8x higher bandwidth efficiency and 3.0-15.3x better power efficiency than multicore CPUs, and we support much larger graphs than the state-of-the-art on GPUs.
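The coalescing effect described in the abstract can be illustrated with a toy model. The sketch below is not the authors' accelerator or cache design. It assumes a fixed DRAM latency, one accelerator access per cycle, and no cache storage at all, so every saving comes from merging accesses into a line request that is already outstanding. The function name `count_dram_requests` and all parameters are hypothetical.

```python
from collections import deque


def count_dram_requests(addresses, mshr_entries, line_bytes=64, latency=100):
    """Toy model: count DRAM line requests issued for a trace of byte addresses.

    Assumptions (illustrative only): fixed DRAM latency in cycles, one access
    per cycle, and no data array, so a line is only "present" while its
    request is still outstanding.
    """
    outstanding = {}     # cache line -> cycle at which its data returns
    in_flight = deque()  # (return_cycle, line), in issue order
    cycle = 0
    dram_requests = 0

    for addr in addresses:
        line = addr // line_bytes

        # Retire every request whose data has already returned.
        while in_flight and in_flight[0][0] <= cycle:
            _, done = in_flight.popleft()
            del outstanding[done]

        if line not in outstanding:
            if len(outstanding) >= mshr_entries:
                # No free miss entry: stall until the oldest request returns.
                cycle, done = in_flight.popleft()
                del outstanding[done]
            # Issue a new DRAM request for this line.
            outstanding[line] = cycle + latency
            in_flight.append((cycle + latency, line))
            dram_requests += 1
        # Otherwise the access is coalesced into the outstanding request.

        cycle += 1

    return dram_requests


# Hypothetical trace: 8 distinct cache lines, each touched 4 times.
trace = [i * 64 for i in range(8)] * 4
few = count_dram_requests(trace, mshr_entries=1)
many = count_dram_requests(trace, mshr_entries=8)
```

On this hypothetical trace, a single miss entry issues 32 DRAM requests (one per access), while eight entries issue only 8, because every repeated access arrives while the request for its line is still in flight. Under these assumptions, tracking more outstanding misses trades latency for fewer memory requests, which is the behaviour the abstract attributes to very large nonblocking caches.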
Pages: 609-622
Page count: 14