Exploring Memory Access Techniques for Efficient FPGA based 3D CNN Accelerator Design

Authors
Khan, Fatima Hameed [1]
Pasha, Muhammad Adeel [1]
Masud, Shahid [1]
Affiliations
[1] Lahore Univ Management Sci LUMS, Dept Elect Engn, Lahore, Pakistan
Keywords
3D CNNs; Parameterized Hardware Accelerator; FPGA; Memory Access Optimization; Systolic Architecture;
DOI
10.1109/AICAS59952.2024.10595963
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Recent advancements in 3D Convolutional Neural Network (CNN) architectures have demonstrated superior performance across diverse computer vision tasks, albeit with a trade-off of intense computational and memory demands. Thus, the tiling of incoming data becomes mandatory for 3D CNN acceleration in memory-constrained platforms such as Field Programmable Gate Arrays (FPGA). In this paper, different memory access techniques are explored to reduce the data traffic between on-chip and off-chip memories during the inference stage of a 3D CNN. The most suitable data traffic mode is identified by considering multiple parameters like latency, on-chip memory utilization and off-chip memory access. A parameterized and modular design approach for 3D CNNs has been implemented on an FPGA, where the input and weight data mapping modules are designed to minimize the on-chip memory requirements. These modules are parameterized for variable tiling sizes and different memory access modes while the main computation is performed on a systolic-array-based pipelined architecture. The experiments conducted on three widely adopted 3D networks, I3D, C3D, and R(2+1)D, have shown 16%, 28%, and 10% improvement in latency respectively. The proposed methodology also results in a lower energy dissipation profile.
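The trade-off between memory access modes mentioned in the abstract can be illustrated with a rough traffic count. The sketch below is not the authors' model: the two modes ("input_reuse", "weight_reuse"), the `Conv3DLayer` type, and the `offchip_traffic` function are hypothetical names chosen for illustration. It assumes a stride-1, unpadded 3D convolution with a cubic kernel, counts elements rather than bytes, and treats every tile as full-sized, so it overestimates traffic when the tile size does not divide the output evenly.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Conv3DLayer:
    """Hypothetical layer description: stride 1, no padding, cubic kernel."""
    c_in: int
    c_out: int
    depth: int
    height: int
    width: int
    kernel: int


def offchip_traffic(layer, tile, tc_out, mode):
    """Estimate off-chip element transfers for one tiled 3D conv layer.

    tile   -- (td, th, tw) output tile size along depth, height, width
    tc_out -- number of output channels processed per tile
    mode   -- "input_reuse": each input tile is fetched once and the
              full weight set is re-fetched for every spatial tile;
              "weight_reuse": weights are fetched once and input tiles
              are re-fetched for every output-channel tile.
    """
    k = layer.kernel
    td, th, tw = tile
    # Output volume of an unpadded, stride-1 convolution.
    d_out = layer.depth - k + 1
    h_out = layer.height - k + 1
    w_out = layer.width - k + 1

    n_spatial = ceil(d_out / td) * ceil(h_out / th) * ceil(w_out / tw)
    n_out_tiles = ceil(layer.c_out / tc_out)

    # Each output tile needs an input tile with a (k - 1) halo per axis.
    in_tile = (td + k - 1) * (th + k - 1) * (tw + k - 1) * layer.c_in
    weights = layer.c_out * layer.c_in * k ** 3
    outputs = layer.c_out * d_out * h_out * w_out  # written once

    if mode == "input_reuse":
        inputs_moved = n_spatial * in_tile
        weights_moved = n_spatial * weights
    elif mode == "weight_reuse":
        inputs_moved = n_out_tiles * n_spatial * in_tile
        weights_moved = weights
    else:
        raise ValueError(f"unknown mode: {mode}")

    return {
        "inputs": inputs_moved,
        "weights": weights_moved,
        "outputs": outputs,
        "total": inputs_moved + weights_moved + outputs,
    }


# Toy layer: 2 -> 4 channels, 6x6x6 input, 3x3x3 kernel, 2x2x2 output tiles.
toy = Conv3DLayer(c_in=2, c_out=4, depth=6, height=6, width=6, kernel=3)
for mode in ("input_reuse", "weight_reuse"):
    print(mode, offchip_traffic(toy, tile=(2, 2, 2), tc_out=2, mode=mode))
```

Under these assumptions, which mode moves less data depends on the ratio of weight volume to input volume and on the tile sizes, which is why a parameterized design that can switch modes per layer is attractive.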
Pages: 218-222
Number of pages: 5