Exploring Memory Access Techniques for Efficient FPGA based 3D CNN Accelerator Design

Cited by: 0
Authors
Khan, Fatima Hameed [1]
Pasha, Muhammad Adeel [1]
Masud, Shahid [1]
Affiliations
[1] Lahore Univ Management Sci LUMS, Dept Elect Engn, Lahore, Pakistan
Keywords
3D CNNs; Parameterized Hardware Accelerator; FPGA; Memory Access Optimization; Systolic Architecture;
DOI
10.1109/AICAS59952.2024.10595963
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advancements in 3D Convolutional Neural Network (CNN) architectures have demonstrated superior performance across diverse computer vision tasks, albeit with a trade-off of intense computational and memory demands. Thus, the tiling of incoming data becomes mandatory for 3D CNN acceleration on memory-constrained platforms such as Field Programmable Gate Arrays (FPGAs). In this paper, different memory access techniques are explored to reduce the data traffic between on-chip and off-chip memories during the inference stage of a 3D CNN. The most suitable data traffic mode is identified by considering multiple parameters such as latency, on-chip memory utilization, and off-chip memory access. A parameterized and modular design approach for 3D CNNs has been implemented on an FPGA, where the input and weight data mapping modules are designed to minimize the on-chip memory requirements. These modules are parameterized for variable tiling sizes and different memory access modes, while the main computation is performed on a systolic-array-based pipelined architecture. Experiments conducted on three widely adopted 3D networks, I3D, C3D, and R(2+1)D, show latency improvements of 16%, 28%, and 10%, respectively. The proposed methodology also results in a lower energy dissipation profile.
Pages: 218-222
Page count: 5