FeatherNet: An Accelerated Convolutional Neural Network Design for Resource-constrained FPGAs

Cited by: 8
Authors
Morcel, Raghid [1]
Hajj, Hazem M. [1]
Saghir, Mazen A. R. [1]
Akkary, Haitham [1]
Artail, Hassan [1]
Khanna, Rahul [2]
Keshavamurthy, Anil [2]
Affiliations
[1] Amer Univ Beirut, POB 11-0236, Beirut 11072020, Lebanon
[2] Intel Corp, Hillsboro, OR USA
Keywords
Convolutional neural networks; embedded-vision; IoT applications; resource-constrained FPGAs
DOI
10.1145/3306202
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional Neural Network (ConvNet or CNN) algorithms are characterized by a large number of model parameters and high computational complexity. These two characteristics make implementation on resource-limited FPGAs challenging, and the challenges are magnified for low-end FPGAs. While previous work has demonstrated successful ConvNet implementations on high-end FPGAs, this article presents a ConvNet accelerator design that enables the implementation of complex deep ConvNet architectures on resource-constrained FPGA platforms aimed at the IoT market. We call the design "FeatherNet" for its light resource utilization. The implementations are VHDL-based, providing flexibility in design optimizations. As part of the design process, new methods are introduced to address several design challenges. The first method is a novel stride-aware graph-based method targeted at ConvNets that aims at achieving efficient signal processing with reduced resource utilization. The second method addresses the challenge of determining the minimal-precision arithmetic needed while preserving high accuracy. For this challenge, we propose variable-width dynamic fixed-point representations combined with a layer-by-layer design-space pruning heuristic across the different layers of the deep ConvNet model. The third method aims at achieving a modular design that can support different types of ConvNet layers while ensuring low resource utilization. For this challenge, we propose modules that are relatively small and composed of computational filters that can be interconnected to build an entire accelerator design. These modules can be easily configured through HDL parameters (e.g., layer type, mask size, and stride) to meet the needs of specific ConvNet implementations, and thus they can be reused to implement a wide variety of ConvNet architectures.
The fourth method addresses the challenge of design portability between two different FPGA vendor platforms, namely Intel/Altera and Xilinx. For this challenge, we propose to instantiate the device-specific hardware blocks needed in each computational filter, rather than relying on the synthesis tools to infer these blocks, while keeping track of the similarities and differences between the two platforms. We believe that the solutions to these design challenges further advance knowledge, as they can benefit designers and other researchers using similar devices or facing similar challenges. Our results demonstrated the success of addressing the design challenges and achieving low (30%) resource utilization on two low-end FPGA platforms: Zedboard and Cyclone V. The design overcame the limitation of designs that target high-end platforms and cannot fit on low-end IoT platforms. Furthermore, our design showed superior performance results (measured in terms of [Frame/s/W] per Dollar) compared to high-end optimized designs.
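The abstract does not spell out how the variable-width dynamic fixed-point representation or the layer-by-layer search is defined. As an illustration only, the following Python sketch assumes one common form of dynamic fixed point, in which each layer shares a single fractional length derived from its largest magnitude, and picks the narrowest candidate width per layer. The function names, the candidate widths, and the use of relative quantization error as the selection criterion are all assumptions made for this example; the article's heuristic targets model accuracy and may differ substantially.

```python
import numpy as np


def quantize_dynamic_fixed_point(values, total_bits):
    """Quantize an array to signed fixed point with one shared fractional length.

    Illustrative assumption: the fractional length is chosen per array (per layer)
    so that the largest magnitude still fits in `total_bits` bits including sign.
    """
    values = np.asarray(values, dtype=np.float64)
    max_abs = float(np.max(np.abs(values)))
    # Integer bits needed for the largest magnitude (may be negative for small values).
    int_bits = int(np.floor(np.log2(max_abs))) + 1 if max_abs > 0 else 0
    frac_bits = total_bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    q_min, q_max = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    # Round to the nearest code and saturate at the representable range.
    codes = np.clip(np.round(values * scale), q_min, q_max)
    return codes / scale, frac_bits


def smallest_width(values, candidate_widths, max_rel_error):
    """Return (total_bits, frac_bits) for the narrowest width within the error bound.

    Hypothetical stand-in for a per-layer search: it uses quantization error,
    not network accuracy, as the acceptance test.
    """
    values = np.asarray(values, dtype=np.float64)
    reference = float(np.max(np.abs(values))) or 1.0
    for bits in sorted(candidate_widths):
        approx, frac_bits = quantize_dynamic_fixed_point(values, bits)
        if float(np.max(np.abs(approx - values))) / reference <= max_rel_error:
            return bits, frac_bits
    return None  # no candidate width met the bound


if __name__ == "__main__":
    # Made-up weights for two hypothetical layers with different dynamic ranges.
    layers = {"conv1": [0.3, -0.7, 0.05], "fc1": [3.0, -1.5, 0.25]}
    for name, weights in layers.items():
        print(name, smallest_width(weights, [4, 8, 16], max_rel_error=0.01))
```

Because the fractional length is derived per layer, a layer with small weights spends its bits on precision while a layer with large weights spends them on range, which is the general motivation for mixing widths across layers.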
Pages: 27
Related papers
50 in total
  • [1] Spatially Invariant Convolutional Spiking Neural Network For Resource-Constrained IoT Devices
    Yadav, Chetali
    Reniwal, Bhupendra Singh
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2025 : 3005 - 3026
  • [2] A convolutional neural network for the resource-constrained project scheduling problem (RCPSP): A new approach
    Golab, Amir
    Gooya, Ehsan Sedgh
    Al Falou, Ayman
    Cabon, Mikael
    DECISION SCIENCE LETTERS, 2023, 12 (02) : 225 - 238
  • [3] A Runtime Switchable Multi-Phase Convolutional Neural Network for Resource-Constrained Systems
    Jang, Jeonggyu
    Yang, Hoeseok
    IEEE ACCESS, 2023, 11 : 62449 - 62461
  • [4] Slim and Efficient Neural Network Design for Resource-Constrained SAR Target Recognition
    Chen, Hongyi
    Zhang, Fan
    Tang, Bo
    Yin, Qiang
    Sun, Xian
    REMOTE SENSING, 2018, 10 (10)
  • [5] T-Net: A Resource-Constrained Tiny Convolutional Neural Network for Medical Image Segmentation
    Khan, Tariq M.
    Robles-Kelly, Antonio
    Naqvi, Syed S.
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022 : 1799 - 1808
  • [6] High Accuracy and Low Latency Mixed Precision Neural Network Acceleration for TinyML Applications on Resource-Constrained FPGAs
    Ng, Wei Soon
    Goh, Wang Ling
    Gao, Yuan
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024
  • [7] Optimized Distribution of an Accelerated Convolutional Neural Network across Multiple FPGAs
    Maarouf, Alaa
    El Droubi, Nour
    Morcel, Raghid
    Hajj, Hazem
    Saghir, Mazen A. R.
    Akkary, Haitham
    28TH IEEE INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2020 : 235 - 235
  • [8] Sparse convolutional neural network acceleration with lossless input feature map compression for resource-constrained systems
    Kwon, Jisu
    Kong, Joonho
    Munir, Arslan
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2022, 16 (01) : 29 - 43
  • [9] Planet Optimization with Deep Convolutional Neural Network for Lightweight Intrusion Detection in Resource-Constrained IoT Networks
    Alissa, Khalid A.
    Alrayes, Fatma S.
    Tarmissi, Khaled
    Yafoz, Ayman
    Alsini, Raed
    Alghushairy, Omar
    Othman, Mahmoud
    Motwakel, Abdelwahed
    APPLIED SCIENCES-BASEL, 2022, 12 (17)
  • [10] Secure Neural Network Inference as a Service with Resource-Constrained Clients
    de Vries, Rik
    Mann, Zoltan Adam
    16TH IEEE/ACM INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING, UCC 2023, 2023