Hardware efficient fast DCT based on novel cyclic convolution structures

Cited by: 36

Authors:

Cheng, Chao [1]

Parhi, Keshab K. [1]

Affiliations:

[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA

Source:

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 2006 / Vol. 54 / No. 11

Funding:

National Science Foundation (USA);

Keywords:

cyclic convolution; discrete cosine transforms; linear convolution; very large-scale integration;

DOI:

10.1109/TSP.2006.881269

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Cyclic convolution is a widely used operation in signal processing. In very large-scale integration (VLSI) design, it is usually implemented with systolic arrays and distributed arithmetic; however, these implementations may be too slow or require too much hardware when the convolution length is large. This paper presents a new fast cyclic convolution algorithm that is hardware efficient and suitable for high-speed VLSI implementation, especially when the convolution length is large. For example, when the proposed algorithm is applied to a prime-length discrete cosine transform (DCT), the proposed high-throughput 1297-length DCT design saves 1216 (94%) multiplications, 282 (22%) additions, and 4792 (74%) delay elements compared with recently proposed systolic-array-based algorithms. Furthermore, the proposed design can run at 1.5 times the speed of previous designs and requires less I/O cost as long as the wordlength L is less than 20 bits.
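For orientation, the operation the abstract refers to can be sketched in its direct (textbook) form. This is only the baseline definition, not the fast algorithm proposed in the paper; the function name cyclic_convolve is a hypothetical name chosen for illustration. The direct form needs N * N multiplications for length N, which is the kind of cost that fast algorithms aim to reduce.

```python
def cyclic_convolve(x, h):
    """Direct-form cyclic convolution: y[n] = sum_k x[k] * h[(n - k) mod N].

    Illustrative baseline only (hypothetical helper, not the paper's method).
    """
    n_len = len(x)
    if len(h) != n_len:
        raise ValueError("both sequences must have the same length")
    y = []
    for n in range(n_len):
        acc = 0
        for k in range(n_len):
            # Index wraps around modulo N; this is what makes it cyclic.
            acc += x[k] * h[(n - k) % n_len]
        y.append(acc)
    return y


print(cyclic_convolve([1, 2, 3], [4, 5, 6]))  # prints [31, 31, 28]
```

Convolving with a unit impulse, for example cyclic_convolve([1, 0, 0], h), returns h unchanged, which is a quick sanity check on the index wrap-around.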


Pages: 4419 - 4434

Page count: 16

50 records in total

[41] Fast and Efficient Hardware Implementation of HQC
Deshpande, Sanjay
Xu, Chuanqi
Nawan, Mamuri
Nawaz, Kashif
Szefer, Jakub
[J]. SELECTED AREAS IN CRYPTOGRAPHY - SAC 2023, 2024, 14201 : 297 - 321
[42] Efficient VLSI implementations of fast multiplierless approximated DCT using parameterized hardware modules for silicon intellectual property design
Hsiao, SF
Hu, YH
Juang, TB
Lee, CH
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2005, 52 (08) : 1568 - 1579
[43] Generalization of the cyclic convolution and its fast computational systems
Murakami, H
[J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2000, E83A (12) : 2743 - 2746
[44] An optimal adder-based hardware architecture for the DCT/SA-DCT
Kinane, A
Muresan, V
O'Connor, N
[J]. VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2005, PTS 1-4, 2005, 5960 : 1410 - 1417
[45] Efficient and Accurate Pattern Synthesis of Circular Antenna Array Employing Iterative Fast Segmented Cyclic Convolution
Sun, Jialu
Liu, Yanhui
Luo, Qianke
Ren, Yi
Guo, Yingjie Jay
[J]. IEEE ANTENNAS AND WIRELESS PROPAGATION LETTERS, 2024, 23 (01) : 269 - 273
[46] A NOVEL FAST DCT COEFFICIENT SCAN ARCHITECTURE
An, Da
Tong, Xin
Zhu, Bingqiang
He, Yun
[J]. PCS: 2009 PICTURE CODING SYMPOSIUM, 2009, : 273 - 276
[47] A novel derivation of the Agarwal-Cooley fast cyclic convolution algorithm based on the Good-Thomas Prime Factor algorithm
Teixeira, M
Rodriguez, D
[J]. 38TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, PROCEEDINGS, VOLS 1 AND 2, 1996, : 89 - 91
[48] A novel superscalar architecture for fast DCT implementation
Yong, Z
Zhang, M
[J]. PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 2000, 1800 : 171 - 177
[49] A Hardware Efficient Technique for Linear Convolution of Finite Length Sequences
Mookherjee, Soumak
DeBrunner, Linda S.
DeBrunner, Victor
[J]. 2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 515 - 519
[50] Hardware Efficient Convolution Processing Unit for Deep Neural Networks
Hazarika, Anakhi
Poddar, Soumyajit
Rahaman, Hafizur
[J]. 2019 2ND INTERNATIONAL SYMPOSIUM ON DEVICES, CIRCUITS AND SYSTEMS (ISDCS 2019), 2019,
