Design-Space Exploration of Quantized Transposed Convolutional Neural Networks for FPGA-based Systems-on-Chip

Cited by: 1
Authors
Sestito, Cristian [1 ,3 ]
Perri, Stefania [2 ]
Stewart, Robert [3 ]
Affiliations
[1] Univ Calabria, Dept Informat Modeling Elect & Syst Engn, Arcavacata Di Rende, Italy
[2] Univ Calabria, Dept Mech Energy & Management Engn, Arcavacata Di Rende, Italy
[3] Heriot Watt Univ, Dept Comp Sci, Edinburgh, Midlothian, Scotland
Source
2022 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2022
Keywords
transposed convolution layers; quantization; field programmable gate arrays (FPGAs); reconfigurable systems-on-chip; ARTIFICIAL-INTELLIGENCE; INTERNET; TRENDS; THINGS
DOI
10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927825
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
With the shift of deep learning applications to Edge Computing devices, compression techniques have been introduced to minimize hardware use, power consumption and latency. For example, quantization uses low numeric precision to represent inputs, parameters and activation functions. Transposed Convolutions (TCONVs) provide neural networks with image up-sampling capabilities. However, the accuracy and performance trade-off of TCONV layers is underexplored, with existing works evaluating down to 8-bit precision but not less. This research systematically evaluates the impact of very low precision when a two-layer quantized decoder, using TCONVs, is implemented within an FPGA-based System-on-Chip (SoC) architecture. We evaluate the impact of quantization on throughput performance and hardware costs, as well as the impact of parallelizing the computations of TCONV layers using the same metrics. Results show that, when 4-bit data are processed, the circuit implemented on a Xilinx Zynq-7020 SoC uses only ~15% of logic and ~7.5% of on-chip memories, at the expense of a negligible ~2.5% accuracy loss with respect to the 8-bit counterpart. Furthermore, a 3.5x speed-up is observed when inputs are processed with 4x parallelism.
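The two core operations in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's hardware architecture; it only shows, under simple assumptions, (a) a symmetric uniform quantizer of the kind used to reach 4-bit precision and (b) a transposed convolution realized as zero-insertion followed by an ordinary "full" 2-D convolution, which is why the layer up-samples its input. The function names `quantize` and `tconv2d` are illustrative, not from the paper.

```python
import numpy as np

def quantize(x, bits=4):
    """Symmetric uniform quantizer (illustrative): snap values onto a
    signed (bits)-bit grid, then de-quantize back to floats."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit
    scale = np.max(np.abs(x)) / qmax
    return np.round(x / scale) * scale

def tconv2d(x, w, stride=2):
    """Transposed convolution via input dilation: insert (stride-1)
    zeros between input pixels, then run an ordinary 'full' 2-D
    convolution. The result is an up-sampled feature map."""
    h, wdt = x.shape
    k = w.shape[0]
    xd = np.zeros(((h - 1) * stride + 1, (wdt - 1) * stride + 1))
    xd[::stride, ::stride] = x            # zero-insertion (dilation)
    xp = np.pad(xd, k - 1)                # 'full' convolution padding
    oh = xp.shape[0] - k + 1
    ow = xp.shape[1] - k + 1
    wf = w[::-1, ::-1]                    # flip kernel: true convolution
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * wf)
    return out

x = np.arange(9, dtype=float).reshape(3, 3)   # toy 3x3 feature map
w = quantize(np.ones((3, 3)))                 # quantized 3x3 kernel
y = tconv2d(x, w, stride=2)
print(y.shape)   # (7, 7): (h-1)*stride + k = 2*2 + 3
```

The output size follows the standard TCONV relation (h-1)*stride + k, so a 3x3 input becomes 7x7: this zero-insertion view is also how such layers are commonly mapped onto FPGA convolution datapaths.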
Pages: 31-36
Page count: 6