Area-Efficient Distributed Arithmetic Optimization via Heuristic Decomposition and In-Memory Computing

Cited by: 5
Authors
Chen, Jian [1]
Zhao, Wenfeng [2]
Ha, Yajun [1]
Affiliations
[1] Shanghaitech Univ, Sch Informat & Sci Technol, Shanghai, Peoples R China
[2] Univ Minnesota, Dept Biomed Engn, Minneapolis, MN USA
Keywords
SRAM; distributed arithmetic; in-memory computing; FIR
DOI
10.1109/asicon47005.2019.8983659
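Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract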
Distributed arithmetic (DA) is widely adopted in digital signal processing (DSP) applications such as filtering, linear transformations, and convolutions, where it offers both area and energy benefits. DA uses look-up tables (LUTs), implemented in SRAM, to store all possible precomputed results. However, a direct implementation causes the LUT size to grow exponentially with the vector size. In this paper, we propose a novel in-memory computation design methodology that reduces the LUT size without heavily degrading speed or power. First, we propose a heuristic decomposition scheme under which only a minimal subset of the precomputed results needs to be stored in the LUT. Second, we design a novel multi-bit in-memory adder that exploits charge-sharing-based carry propagation. In a design case, applying our method to a state-of-the-art DA-based FIR filter reduces the overall area by 10% while maintaining the same speed and a similar level of energy.
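The exponential LUT growth described in the abstract, and the general idea of shrinking it by decomposition, can be illustrated with a small software model. This is only a sketch under stated assumptions: it models textbook bit-serial DA with unsigned inputs and splits the coefficient vector into equal-sized groups, which is a common baseline and not the heuristic decomposition or the in-memory adder proposed in the paper. The names `build_lut` and `da_inner_product` and the `group` parameter are hypothetical.

```python
def build_lut(coeffs):
    """Precompute the sum of every subset of `coeffs`.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in addr,
    so the table has 2**len(coeffs) entries.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, xs, bits, group=None):
    """Bit-serial DA inner product of `coeffs` with unsigned `bits`-bit inputs.

    `group` (an illustrative assumption, not from the paper) is the number of
    coefficients served by each LUT; None means one LUT for the whole vector.
    Returns (result, total number of LUT entries).
    """
    n = len(coeffs)
    group = group or n
    # One LUT per group of coefficients.
    luts = [build_lut(coeffs[i:i + group]) for i in range(0, n, group)]

    acc = 0
    for b in range(bits):  # process one bit-plane of the inputs per step
        partial = 0
        for j, lut in enumerate(luts):
            chunk = xs[j * group:(j + 1) * group]
            # Bit b of every input in the group forms the LUT address.
            addr = sum(((x >> b) & 1) << k for k, x in enumerate(chunk))
            partial += lut[addr]
        acc += partial << b  # weight the bit-plane by 2**b
    return acc, sum(len(lut) for lut in luts)
```

For an 8-element vector, a single LUT needs 2^8 = 256 entries, while two 4-element groups need 2 × 2^4 = 32 entries at the cost of one extra addition per bit-plane. That trade-off between storage and added arithmetic is the one the abstract refers to.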
Pages: 4