xDNN: Inference for Deep Convolutional Neural Networks

Cited: 4
Authors
D'Alberto, Paolo [1 ]
Wu, Victor [1 ]
Ng, Aaron [1 ]
Nimaiyar, Rahul [1 ]
Delaye, Elliott [1 ]
Sirasao, Ashish [2 ]
Affiliations
[1] Xilinx, Logic Dr, San Jose, CA 95124 USA
[2] FaceBook, 1 Hacker Way, Menlo Pk, CA 94025 USA
Keywords
AI inference; low latency; high efficiency; custom architectures; optimizations;
DOI
10.1145/3473334
CLC Classification Number
TP3 [computing technology; computer technology];
Subject Classification Code
0812 ;
Abstract
We present xDNN, an end-to-end system for deep-learning inference based on a family of specialized hardware processors synthesized on Field-Programmable Gate Arrays (FPGAs) and Convolutional Neural Networks (CNNs). We present a design optimized for low latency, high throughput, and high compute efficiency with no batching. The design is scalable and a parametric function of the number of multiply-accumulate units, the on-chip memory hierarchy, and the numerical precision. The design can be scaled down to produce a processor for embedded devices, replicated to produce more cores for larger devices, or resized to optimize efficiency. On a Xilinx Virtex Ultrascale+ VU13P FPGA, we achieve 800 MHz, close to the maximum Digital Signal Processing (DSP) frequency, and above 80% efficiency of on-chip compute resources. On top of our processor family, we present a runtime system enabling the execution of different networks for different input sizes (i.e., from 224 x 224 to 2048 x 1024). We present a compiler that reads CNNs from native frameworks (i.e., MXNet, Caffe, Keras, and TensorFlow), optimizes them, generates code, and provides performance estimates. The compiler combines quantization information from the native environment with optimizations to feed the runtime with code as efficient as any hardware expert could write. We present tools that partition a CNN into subgraphs to divide work between CPU cores and FPGAs. Notice that the software will not change when or if the FPGA design becomes an ASIC, making our work vertical and not just a proof-of-concept FPGA project. We show experimental results for accuracy, latency, and power for several networks. In summary, we achieve up to 4 times higher throughput and 3 times better power efficiency than GPUs, and up to 20 times higher throughput than the latest CPUs. To our knowledge, our solutions are faster than any previous FPGA-based solutions and comparable to other off-the-shelf solutions.
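The abstract describes tools that partition a CNN into subgraphs so that accelerator-supported operators run on the FPGA while the remainder falls back to CPU cores. A minimal sketch of that idea, assuming a linear operator sequence and a hypothetical supported-op set (this is an illustration, not the actual xDNN partitioner):

```python
# Illustrative sketch (not the xDNN tool itself): split a linear CNN
# operator sequence into contiguous subgraphs, assigning runs of
# accelerator-supported ops to the FPGA and everything else to the CPU.

# Hypothetical set of ops the accelerator can execute.
FPGA_SUPPORTED = {"conv", "relu", "maxpool", "eltwise_add"}

def partition(ops):
    """Group consecutive ops into (target, ops) subgraphs,
    where target is 'fpga' or 'cpu'."""
    subgraphs = []
    for op in ops:
        target = "fpga" if op in FPGA_SUPPORTED else "cpu"
        if subgraphs and subgraphs[-1][0] == target:
            # Same target as the previous op: extend the current subgraph.
            subgraphs[-1][1].append(op)
        else:
            # Target changed: start a new subgraph.
            subgraphs.append((target, [op]))
    return subgraphs

net = ["conv", "relu", "maxpool", "softmax", "conv", "relu"]
print(partition(net))
# [('fpga', ['conv', 'relu', 'maxpool']), ('cpu', ['softmax']), ('fpga', ['conv', 'relu'])]
```

A real partitioner works on a dataflow graph rather than a flat list and must also weigh data-transfer cost between devices, but the grouping of maximal supported runs is the core of the division of work the abstract refers to.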
Pages: 29
Related Papers
50 records
  • [1] Towards explainable deep neural networks (xDNN)
    Angelov, Plamen
    Soares, Eduardo
    [J]. NEURAL NETWORKS, 2020, 130 : 185 - 194
  • [2] Deep inference: A Convolutional Neural Networks Method for Parameter Recovery of the Fractional Dynamics
    Biranvand, N.
    Hadian-Rasanan, A. H.
    Khalili, A.
    Rad, J. A.
    [J]. INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2021, 12 (01): : 189 - 201
  • [3] Deep Convolutional Neural Networks
    Gonzalez, Rafael C.
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2018, 35 (06) : 79 - 87
  • [4] Simulating quantized inference on convolutional neural networks
    Finotti, Vitor
    Albertini, Bruno
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2021, 95
  • [6] FPGA based Flexible Implementation of Light Weight Inference on Deep Convolutional Neural Networks
    Dawwd, Shefa
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2024, 21 (03) : 408 - 417
  • [7] Efficient adaptive inference for deep convolutional neural networks using hierarchical early exits
    Passalis, Nikolaos
    Raitoharju, Jenni
    Tefas, Anastasios
    Gabbouj, Moncef
    [J]. PATTERN RECOGNITION, 2020, 105
  • [8] DEEP NEURAL NETWORKS FOR ESTIMATION AND INFERENCE
    Farrell, Max H.
    Liang, Tengyuan
    Misra, Sanjog
    [J]. ECONOMETRICA, 2021, 89 (01) : 181 - 213
  • [9] Property Inference for Deep Neural Networks
    Gopinath, Divya
    Converse, Hayes
    Pasareanu, Corina S.
    Taly, Ankur
    [J]. 34TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2019), 2019, : 809 - 821
  • [10] Deep Anchored Convolutional Neural Networks
    Huang, Jiahui
    Dwivedi, Kshitij
    Roig, Gemma
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 639 - 647