Exploring Memory Access Techniques for Efficient FPGA based 3D CNN Accelerator Design

Cited by: 0
Authors
Khan, Fatima Hameed [1]
Pasha, Muhammad Adeel [1]
Masud, Shahid [1]
Affiliations
[1] Lahore Univ Management Sci LUMS, Dept Elect Engn, Lahore, Pakistan
Keywords
3D CNNs; Parameterized Hardware Accelerator; FPGA; Memory Access Optimization; Systolic Architecture;
DOI
10.1109/AICAS59952.2024.10595963
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Recent advancements in 3D Convolutional Neural Network (CNN) architectures have demonstrated superior performance across diverse computer vision tasks, albeit with a trade-off of intense computational and memory demands. Thus, the tiling of incoming data becomes mandatory for 3D CNN acceleration in memory-constrained platforms such as Field Programmable Gate Arrays (FPGAs). In this paper, different memory access techniques are explored to reduce the data traffic between on-chip and off-chip memories during the inference stage of a 3D CNN. The most suitable data traffic mode is identified by considering multiple parameters like latency, on-chip memory utilization and off-chip memory access. A parameterized and modular design approach for 3D CNNs has been implemented on an FPGA, where the input and weight data mapping modules are designed to minimize the on-chip memory requirements. These modules are parameterized for variable tiling sizes and different memory access modes while the main computation is performed on a systolic-array-based pipelined architecture. The experiments conducted on three widely adopted 3D networks, I3D, C3D, and R(2+1)D, have shown 16%, 28%, and 10% improvement in latency respectively. The proposed methodology also results in a lower energy dissipation profile.
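The trade-off between on-chip buffer size and off-chip traffic that the abstract describes can be illustrated with a rough sketch. The model below is not taken from the paper: the function name `conv3d_tile_traffic`, its parameters, and the two access modes (refetching input tiles for every output-channel tile versus keeping them on chip) are assumptions chosen only to show how tiling choices change the number of words moved for a single stride-1 3D convolution layer.

```python
from math import ceil, prod


def conv3d_tile_traffic(c_in, c_out, out_dhw, kernel_dhw,
                        tile_dhw, tile_cin, tile_cout,
                        keep_inputs_on_chip=False):
    """Estimate off-chip words moved for one stride-1 3D conv layer.

    Hypothetical model for illustration; not the paper's cost model.
    """
    # Tiles along each loop dimension (edge tiles are counted as full tiles).
    n_spatial = prod(ceil(o / t) for o, t in zip(out_dhw, tile_dhw))
    n_cin = ceil(c_in / tile_cin)
    n_cout = ceil(c_out / tile_cout)

    # With stride 1, an output tile of extent t needs t + k - 1 inputs per axis.
    in_tile = tile_cin * prod(t + k - 1 for t, k in zip(tile_dhw, kernel_dhw))
    w_tile = tile_cout * tile_cin * prod(kernel_dhw)

    # Input tiles are fetched once per spatial tile when kept on chip,
    # otherwise once per (spatial tile, output-channel tile) pair.
    input_fetches = n_spatial * n_cin * (1 if keep_inputs_on_chip else n_cout)
    # Weight tiles are refetched for every spatial tile in both modes.
    weight_fetches = n_spatial * n_cin * n_cout

    return {
        "input_words": input_fetches * in_tile,
        "weight_words": weight_fetches * w_tile,
        "output_words": c_out * prod(out_dhw),
        # Keeping inputs on chip requires buffering all input-channel tiles.
        "input_buffer_words": in_tile * (n_cin if keep_inputs_on_chip else 1),
    }


layer = dict(c_in=4, c_out=8, out_dhw=(4, 4, 4), kernel_dhw=(3, 3, 3),
             tile_dhw=(2, 2, 2), tile_cin=2, tile_cout=4)
refetch = conv3d_tile_traffic(**layer)
reuse = conv3d_tile_traffic(**layer, keep_inputs_on_chip=True)
print(refetch["input_words"], reuse["input_words"])
print(refetch["input_buffer_words"], reuse["input_buffer_words"])
```

In this toy configuration, keeping the input tiles on chip halves the input traffic while doubling the input buffer, which is the kind of latency, on-chip memory, and off-chip access balance the abstract refers to.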
Pages: 218-222
Page count: 5
Related papers
50 in total
  • [1] Exploration of Memory Access Optimization for FPGA-based 3D CNN Accelerator
    Tian, Teng
    Jin, Xi
    Zhao, Letian
    Wang, Xiaotian
    Wang, Jie
    Wu, Wei
    PROCEEDINGS OF THE 2020 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2020), 2020, : 1650 - 1655
  • [2] High Throughput CNN Accelerator Design Based on FPGA
    Xie, Liang
    Fan, Xitian
    Cao, Wei
    Wang, Lingli
    2018 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT 2018), 2018, : 277 - 280
  • [3] FPGA design and implementation of a matrix multiplier based accelerator for 3D EKF SLAM
    Tertei, Daniel Tortei
    Piat, Jonathan
    Devy, Michel
    2014 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS (RECONFIG), 2014,
  • [4] Memory Optimization Techniques for FPGA based CNN Implementations
    Shahshahani, Masoud
    Goswami, Pingakshya
    Bhatia, Dinesh
    PROCEEDINGS OF THE 2018 IEEE 13TH DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS), 2018,
  • [5] Design of high parallel CNN accelerator based on FPGA for AIoT
    Zhijian L.
    Xuewei G.
    Xiaopei C.
    Zhipeng Z.
    Xiaoyong D.
    Pingping C.
    Journal of China Universities of Posts and Telecommunications, 2022, 29 (05): : 1 - 9
  • [6] Design of high parallel CNN accelerator based on FPGA for AIoT
    Lin Zhijian
    Gao Xuewei
    Chen Xiaopei
    Zhu Zhipeng
    Du Xiaoyong
    Chen Pingping
    The Journal of China Universities of Posts and Telecommunications, 2022, 29 (05) : 1 - 9
  • [7] The Design of Lightweight and Multi Parallel CNN Accelerator Based on FPGA
    Li Zong-ling
    Wang Lu-yuan
    Yu Ji-yang
    Cheng Bo-wen
    Hao Liang
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 1521 - 1528
  • [8] FPGA design of EKF block accelerator for 3D visual SLAM
    Tertei, Daniel Tortei
    Piat, Jonathan
    Devy, Michel
    COMPUTERS & ELECTRICAL ENGINEERING, 2016, 55 : 123 - 137
  • [9] A Resource Efficient CNN Accelerator for Sensor Signal Processing Based on FPGA
    Wu, Ruidong
    Liu, Bing
    Fu, Ping
    Chen, Haolin
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2023, 32 (05)
  • [10] Low Power FPGA-SoC Design Techniques for CNN-based Object Detection Accelerator
    Kim, Heekyung
    Choi, Ken
    2019 IEEE 10TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON), 2019, : 1130 - 1134