FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability

Cited by: 8
Authors
Shi, Runbin [1]
Ding, Yuhao [1]
Wei, Xuechao [2]
Li, He [3]
Liu, Hang [4]
So, Hayden K. H. [1]
Ding, Caiwen [5]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
[3] Univ Cambridge, Cambridge, England
[4] Stevens Inst Technol, Hoboken, NJ 07030 USA
[5] Univ Connecticut, Storrs, CT USA
Source
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC) | 2020
Funding
National Science Foundation (USA);
DOI
10.1109/dac18072.2020.9218581
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Fast inference is of paramount value to a wide range of deep learning applications. This work presents FTDL, a highly scalable FPGA overlay framework for deep learning applications, to address the architecture and hardware mismatch faced by traditional efforts. The FTDL overlay is specifically optimized for the tiled structure of FPGAs, thereby achieving post-place-and-route operating frequencies exceeding 88% of the theoretical maximum across different devices and design scales. A flexible compilation framework efficiently schedules the matrix multiply and convolution operations of large neural network inference on the overlay and achieves over 80% hardware efficiency on average. Taking advantage of both high operating frequency and hardware efficiency, FTDL achieves 402.6 and 151.2 FPS with GoogLeNet and ResNet50 on ImageNet, respectively, while operating at a power efficiency of 27.6 GOPS/W, making it up to 7.7x higher in performance and 1.9x more power-efficient than the state of the art.
Pages: 6
Related papers
50 in total
  • [21] Adaptive Deep Models for Incremental Learning: Considering Capacity Scalability and Sustainability
    Yang, Yang
    Zhou, Da-Wei
    Zhan, De-Chuan
    Xiong, Hui
    Jiang, Yuan
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 74 - 82
  • [22] Nanosecond machine learning regression with deep boosted decision trees in FPGA for high energy physics
    Carlson, B. T.
    Bayer, Q.
    Hong, T. M.
    Roche, S. T.
    JOURNAL OF INSTRUMENTATION, 2022, 17 (09):
  • [23] FPGA Accelerates Deep Residual Learning for Image Recognition
    Li, Xuelei
    Ding, Liangkui
    Wang, Li
    Cao, Fang
    PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 837 - 840
  • [24] Parallel Dot-Products for Deep Learning on FPGA
    Vestias, Mario
    Duarte, Rui Policarpo
    de Sousa, Jose T.
    Neto, Horacio
    2017 27TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2017,
  • [25] Evaluating Embedded FPGA Accelerators for Deep Learning Applications
    Hegde, Gopalakrishna
    Siddhartha
    Ramasamy, Nachiappan
    Buddha, Vamsi
    Kapre, Nachiket
    2016 IEEE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2016, : 25 - 25
  • [26] A Deep Learning prediction process accelerator based FPGA
    Yu, Qi
    Wang, Chao
    Ma, Xiang
    Li, Xi
    Zhou, Xuehai
    2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, : 1159 - 1162
  • [27] DLAU: A Scalable Deep Learning Accelerator Unit on FPGA
    Wang, Chao
    Gong, Lei
    Yu, Qi
    Li, Xi
    Xie, Yuan
    Zhou, Xuehai
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 36 (03) : 513 - 517
  • [28] A New Similarity Space Tailored for Supervised Deep Metric Learning
    Barros, Pedro
    Queiroz, Fabiane
    Figueiredo, Flavio
    Dos Santos, Jefersson A.
    Ramos, Heitor
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (01)
  • [29] Research on OpenCL optimization for FPGA deep learning application
    Zhang, Shuo
    Wu, Yanxia
    Men, Chaoguang
    He, Hongtao
    Liang, Kai
    PLOS ONE, 2019, 14 (10):
  • [30] A Customizable Domain-Specific Memory-Centric FPGA Overlay for Machine Learning Applications
    Panahi, Atiyehsadat
    Balsalama, Suhail
    Ishimwe, Ange-Thierry
    Mbongue, Joel Mandebi
    Andrews, David
    2021 31ST INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL 2021), 2021, : 24 - 27