Area-Efficient Distributed Arithmetic Optimization via Heuristic Decomposition and In-Memory Computing

Cited by: 5

Authors:

Chen, Jian [1]

Zhao, Wenfeng [2]

Ha, Yajun [1]

Affiliations:

[1] Shanghaitech Univ, Sch Informat & Sci Technol, Shanghai, Peoples R China

[2] Univ Minnesota, Dept Biomed Engn, Minneapolis, MN USA

Source:

2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) | 2019

Keywords:

SRAM; distributed arithmetic; in-memory computing; FIR

DOI:

10.1109/asicon47005.2019.8983659

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Distributed arithmetic (DA) is widely adopted in digital signal processing (DSP) applications such as filtering, linear transformations, and convolutions, where it offers both area and energy benefits. DA uses look-up tables (LUTs), implemented in SRAM, to store all possible precomputed results. However, a direct implementation makes the LUT size grow exponentially with the vector size. In this paper, we propose a novel in-memory computing design methodology that reduces the LUT size without significantly degrading speed or power. First, we propose a heuristic decomposition scheme under which only a minimal subset of the precomputed results needs to be stored in the LUT. Second, we design a novel multi-bit in-memory adder that exploits charge-sharing-based carry propagation. In a design case applying our method to a state-of-the-art DA-based FIR filter, the overall area is reduced by 10% while maintaining the same speed and a similar level of energy.
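To make the LUT-growth problem in the abstract concrete, the following is a minimal sketch of textbook bit-serial DA for an inner product. It is not the paper's decomposition scheme or its in-memory adder; the function names `build_lut` and `da_inner_product`, the example coefficients, and the choice of two's-complement integer inputs are all assumptions made for illustration.

```python
def build_lut(coeffs):
    """Precompute, for every N-bit address, the sum of the selected coefficients.

    Bit k of the address selects coeffs[k], so the table has 2**N entries.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, xs, bits):
    """Bit-serial DA: one LUT read and one shift-add per input bit.

    Assumes every x in xs fits in `bits`-bit two's complement.
    """
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    words = [x & mask for x in xs]  # two's-complement encodings of the inputs
    acc = 0
    for b in range(bits):
        # Gather bit b of every input into one LUT address.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k
        partial = lut[addr] << b
        # The most significant bit carries weight -2**(bits - 1).
        acc += -partial if b == bits - 1 else partial
    return acc


# Hypothetical 4-tap example.
coeffs = [3, -1, 4, 2]
xs = [5, -3, 7, -8]
result = da_inner_product(coeffs, xs, bits=4)
direct = sum(c * x for c, x in zip(coeffs, xs))
lut_size = len(build_lut(coeffs))
```

Here `result` matches the directly computed inner product `direct`, and `lut_size` is 2**4 = 16. Each additional coefficient doubles the table, which is the exponential growth the abstract refers to.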


Pages: 4
