Distributed Recommendation Inference on FPGA Clusters

Cited by: 9
Authors
Zhu, Yu [1 ]
He, Zhenhao [1 ]
Jiang, Wenqi [1 ]
Zeng, Kai [2 ]
Zhou, Jingren [2 ]
Alonso, Gustavo [1 ]
Affiliations
[1] Swiss Fed Inst Technol, Syst Grp, Zurich, Switzerland
[2] Alibaba Grp, Hangzhou, Peoples R China
Source
2021 31ST INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL 2021) | 2021
DOI
10.1109/FPL53798.2021.00057
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks are widely used in personalized recommendation systems. Such models involve two major components: the memory-bound embedding layer and the computation-bound fully-connected layers. Existing solutions are either slow on both stages or only optimize one of them. To implement recommendation inference efficiently in the context of a real deployment, we design and implement an FPGA cluster optimizing the performance of both stages. To remove the memory bottleneck, we take advantage of the High-Bandwidth Memory (HBM) available on the latest FPGAs for highly concurrent embedding table lookups. To match the required DNN computation throughput, we partition the workload across multiple FPGAs interconnected via a 100 Gbps TCP/IP network. Compared to an optimized CPU baseline (16 vCPU, AVX2-enabled) and a one-node FPGA implementation, our system (four-node version) achieves 28.95x and 7.68x speedup in terms of throughput respectively. The proposed system also guarantees a latency of tens of microseconds per single inference, significantly better than CPU and GPU-based systems which take at least milliseconds.
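The two stages named in the abstract can be illustrated with a minimal software sketch. This is not the authors' FPGA design: the table sizes, layer widths, and function names below are hypothetical and chosen only to show why the embedding stage is dominated by scattered memory reads while the fully-connected stage is dominated by multiply-accumulate work.

```python
import random


def embedding_lookup(tables, indices):
    """Memory-bound stage: read one row from each table and concatenate."""
    features = []
    for table, idx in zip(tables, indices):
        features.extend(table[idx])  # one scattered row read per table
    return features


def fully_connected(x, weights, biases, relu=True):
    """Compute-bound stage: one dense layer, y = W x + b, optional ReLU."""
    out = []
    for row, b in zip(weights, biases):
        acc = b + sum(w * v for w, v in zip(row, x))
        out.append(max(acc, 0.0) if relu else acc)
    return out


def infer(tables, layers, indices):
    """Run a single inference: embedding lookups followed by dense layers."""
    x = embedding_lookup(tables, indices)
    for i, (w, b) in enumerate(layers):
        last = i == len(layers) - 1
        x = fully_connected(x, w, b, relu=not last)
    return x[0]


# Hypothetical sizes, for illustration only (not taken from the paper).
NUM_TABLES, ROWS, DIM, HIDDEN = 4, 100, 8, 16
rng = random.Random(0)


def dense(n_in, n_out):
    """Build random weights and zero biases for one dense layer."""
    w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


tables = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(ROWS)]
    for _ in range(NUM_TABLES)
]
layers = [dense(NUM_TABLES * DIM, HIDDEN), dense(HIDDEN, 1)]
score = infer(tables, layers, [3, 17, 42, 99])
```

In this sketch each inference touches only `NUM_TABLES` rows but performs a multiply-accumulate for every weight in `layers`, which is the split the abstract describes between highly concurrent lookups and DNN computation partitioned across several devices.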
Pages: 279-285
Number of pages: 7