IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators

Cited by: 0

Authors:

Wadhwa, Eashan [1]

Khandelwal, Shashwat [1]

Shreejith, Shanker [1]

Affiliations:

[1] Trinity Coll Dublin, Dept Elect & Elect Engn, Dublin, Ireland

Source:

2022 IEEE 33RD INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP) | 2022

Keywords:

Inference Algorithms; Field Programmable Gate Arrays; Convolution Neural Networks;

DOI:

10.1109/ASAP54787.2022.00027

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

Quantised convolution neural networks (QCNNs) on FPGAs have shown tremendous potential for deploying deep learning on resource-constrained devices closer to the data source or in embedded applications. The convolutional layers are an essential building block of (Q)CNNs. FPGA implementations use modified versions of convolution kernels, based on variations of the sliding kernel algorithm, to reduce resource overheads. While these alleviate resource consumption to a certain degree, they still consume considerable (distributed) memory resources, requiring larger FPGA devices with sufficient on-chip memory elements to implement deep QCNNs. In this paper, we present the Inverse Memory Efficient Convolution (IMEC) algorithm, a novel strategy to lower the memory consumption of convolutional layers in QCNNs. Through a series of data organisation and computational optimisations, IMEC lowers the footprint of the intermediate matrix buffers incurred within the convolutional layers and the number of multiply-accumulate (MAC) operators required at each layer. We evaluate IMEC by integrating it into the BNN-PYNQ framework, which can compile high-level QCNN representations to an FPGA bitstream. Our results show that IMEC can reduce the memory footprint and the overall resource overhead (LUT and FF count) of the convolutional layers by ~33% and ~20% respectively, across multiple quantisation levels (1-bit to 8-bit), while maintaining the same inference accuracy as state-of-the-art QCNN implementations.
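The abstract's point about intermediate buffer memory can be made concrete with a rough back-of-the-envelope sketch. The Python below is not the IMEC algorithm, whose details are not given in this record; it only compares two generic buffering strategies for a convolutional layer at different activation bit widths. The function names, the layer shape (32x32 input, 3 channels, 3x3 kernel), and the assumption of no padding are all hypothetical choices made for illustration.

```python
def lowered_matrix_bits(in_h, in_w, in_ch, k, stride, act_bits):
    """Bits needed to hold a fully lowered (im2col-style) input matrix.

    Hypothetical helper: assumes a square k x k kernel and no padding.
    """
    out_h = (in_h - k) // stride + 1
    out_w = (in_w - k) // stride + 1
    rows = out_h * out_w          # one row per output position
    cols = k * k * in_ch          # one column per kernel tap and channel
    return rows * cols * act_bits


def line_buffer_bits(in_w, in_ch, k, act_bits):
    """Bits needed if only k input rows are buffered at a time.

    Hypothetical helper: a generic sliding-window line buffer,
    not the buffer layout used by IMEC or BNN-PYNQ.
    """
    return k * in_w * in_ch * act_bits


if __name__ == "__main__":
    # Illustrative layer only: 32x32 input, 3 channels, 3x3 kernel, stride 1.
    for bits in (1, 2, 4, 8):
        full = lowered_matrix_bits(32, 32, 3, 3, 1, bits)
        line = line_buffer_bits(32, 3, 3, bits)
        print(f"{bits}-bit activations: lowered={full} bits, line buffer={line} bits")
```

Both estimates scale linearly with the activation bit width, which is one reason a buffer-level saving can carry across the 1-bit to 8-bit quantisation range mentioned in the abstract.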


Pages: 115-121

Page count: 7
