Large-Scale Graph Processing on FPGAs with Caches for Thousands of Simultaneous Misses

Cited by: 14
Authors
Asiatici, Mikhail [1]
Ienne, Paolo [1]
Affiliations
[1] Ecole Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences, CH-1015 Lausanne, Switzerland
Keywords
graph; MOMS; nonblocking cache; DRAM; FPGA; GPU
DOI
10.1109/ISCA52012.2021.00054
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Efficient large-scale graph processing is crucial to many disciplines. Yet, while graph algorithms naturally expose massive parallelism opportunities, their performance is limited by the memory system because of irregular memory accesses. State-of-the-art FPGA graph processors, such as ForeGraph and FabGraph, address the memory issues by using scratchpads and regularly streaming edges from DRAM, but then they end up wasting bandwidth on unneeded data. Yet, where classic caches and scratchpads fail to deliver, FPGAs make powerful unorthodox solutions possible. In this paper, we resort to extreme nonblocking caches that handle tens of thousands of outstanding read misses. They significantly increase the ability of memory systems to coalesce multiple accelerator accesses into fewer DRAM memory requests; essentially, when latency is not the primary concern, they bring the advantages expected from a very large cache at a fraction of the cost. We prove our point with an adaptable graph accelerator running on Amazon AWS f1; our implementation takes into account all practical aspects of such a design, including the challenges involved when working with modern multidie FPGAs. Running classic algorithms (PageRank, SCC, and SSSP) on large graphs, we achieve 3x geometric mean speedup compared to state-of-the-art FPGA accelerators, 1.1-5.8x higher bandwidth efficiency and 3.0-15.3x better power efficiency than multicore CPUs, and we support much larger graphs than the state-of-the-art on GPUs.
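The coalescing effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the function name `count_dram_requests`, its parameters, the fixed latency, and the "one access per time step" clock are all assumptions made for illustration. It models only a table of outstanding misses (no data cache) and counts how many DRAM line requests a stream of accesses generates when the table can track more or fewer misses at once.

```python
from collections import OrderedDict


def count_dram_requests(addresses, line_bytes=64, max_outstanding=16, latency=100):
    """Toy model: count DRAM line requests for a stream of byte addresses.

    Assumptions (hypothetical, for illustration only):
    - one access is issued per time step;
    - every DRAM request returns exactly `latency` steps after it is issued;
    - an access to a line with a request still in flight is coalesced;
    - when the table is full, the stream stalls until the oldest request returns.
    """
    outstanding = OrderedDict()  # line index -> completion time, oldest first
    now = 0
    requests = 0
    for addr in addresses:
        line = addr // line_bytes
        # Retire requests whose data has already come back.
        while outstanding and next(iter(outstanding.values())) <= now:
            outstanding.popitem(last=False)
        if line not in outstanding:
            if len(outstanding) >= max_outstanding:
                # Table full: wait for the oldest in-flight request.
                _, done = outstanding.popitem(last=False)
                now = max(now, done)
            outstanding[line] = now + latency
            requests += 1
        now += 1
    return requests


# Forty accesses that cycle over four distinct cache lines.
stream = [(i % 4) * 64 for i in range(40)]
few = count_dram_requests(stream, max_outstanding=1)
many = count_dram_requests(stream, max_outstanding=4)
print(few, many)
```

In this toy setting, tracking four misses at once lets all repeated accesses merge into the four requests already in flight, whereas a single-entry table issues a new request for every access. Real designs differ in many respects (variable DRAM latency, per-miss bookkeeping, multiple banks), so the numbers only show the direction of the effect.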
Pages: 609-622
Page count: 14