Optimal Realization of Distributed Arithmetic-Based MAC Adaptive FIR Filter Architecture Incorporating Radix-4 and Radix-8 Computation

Cited by: 2
Authors
James, Britto Pari [1]
Leung, Man-Fai [2]
Vaithiyanathan, Dhandapani [3]
Mariammal, Karuthapandian [4]
Affiliations
[1] Vel Tech Rangarajan Dr Sagunthala R&D Inst Sci & T, Chennai 600062, India
[2] Anglia Ruskin Univ, Fac Sci & Engn, Sch Comp & Informat Sci, Cambridge CB1 1PT, England
[3] Natl Inst Technol Delhi, Delhi 110036, India
[4] Anna Univ, Madras Inst Technol, Chrompet 600044, Chennai, India
Keywords
ARDA multiplier; multiply and accumulate (MAC); FIR filter; low area; high speed; FPGA implementation; efficient; power
DOI
10.3390/electronics13173551
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Finite impulse response (FIR) filters are widely used in critical applications in communication and signal processing. Advances in technology call for application-specific designs with optimal characteristics. This work proposes an efficient distributed arithmetic adaptive FIR filter (DAAFA) architecture using radix-4 and radix-8 computation. Distributed arithmetic (DA) is widely used to compute a sum of products without a multiplier. The proposed fixed-point, single multiply and accumulate (MAC) adaptive FIR filter is implemented with minimal design complexity. The total longest-path (critical-path) computation time is the sum of the delay in the error calculation module and the delay in updating the filter weights; when this time is high, latency increases. In addition, the approximate radix DA multiplier is built from Booth recoding, a partial product formation block and a shift-based accumulation block. The approximate DA design reduces complexity and area, measured in slices, and increases the operating speed. Partial products are generated using shifters and efficient adders, which further improves performance. The design is implemented on Xilinx and Altera devices and compared with the existing literature. Synthesis results show that the proposed design performs better in terms of complexity, slice-delay product and maximum operating speed. The proposed architecture was found to be effective in reducing area, delay and complexity. The results indicate an area reduction (slices) of about 92.03% compared to the existing design, and a speed enhancement of about 90.7%. The architecture uses the least mean square (LMS) approach, which notably improves the convergence rate.
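As a rough illustration of the Booth recoding, partial product formation and shift-based accumulation steps named in the abstract, the following is a minimal software sketch of radix-4 Booth multiplication. It is not the authors' architecture: the function names `booth_radix4_digits` and `booth_radix4_multiply`, the 8-bit default word length and the exact (non-approximate) arithmetic are all assumptions made for illustration only.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode a two's-complement integer into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first.
    `bits` is an assumed word length; the multiplier must fit in it.
    """
    if bits % 2:
        bits += 1  # sign-extend to an even width so bits pair up
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # overlapping 3-bit group
        prev = b1
    return digits


def booth_radix4_multiply(x: int, w: int, bits: int = 8) -> int:
    """Multiply x by w using only shifts, negation and additions."""
    acc = 0
    for i, d in enumerate(booth_radix4_digits(w, bits)):
        if d == 0:
            continue  # zero digit contributes no partial product
        partial = x << (abs(d) - 1)  # |d| = 1 -> x, |d| = 2 -> 2x
        if d < 0:
            partial = -partial
        acc += partial << (2 * i)  # weight 4**i applied as a shift
    return acc


if __name__ == "__main__":
    print(booth_radix4_digits(7))        # [-1, 2, 0, 0]
    print(booth_radix4_multiply(13, -3)) # -39
```

Radix-4 recoding halves the number of partial products relative to bit-serial multiplication, and every partial product is 0, ±x or ±2x, so only shifters and adders are needed. Radix-8 recoding would additionally require the multiple 3x, which cannot be formed by a single shift.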
Pages: 19