An improved architecture for bit-level matrix multiplication

Citations: 0
Authors
Grover, RS [1]
Shang, WJ [1]
Li, Q [1]
Affiliations
[1] Santa Clara Univ, Dept Comp Engn, Santa Clara, CA 95053 USA
Keywords
bit-level matrix multiplication; FPGA array; mapping algorithms to hardware; reconfigurable computing
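DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract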
We present a novel bit-level architecture in which each processing element performs one simple operation: it adds three to six bits to generate one partial-sum bit and one or two carry-out bits. The speedup over word-level designs comes from the fact that, in a bit-level architecture, the individual bits of a word do not have to be processed as a unit. In [1], two bit-level architectures for fixed-point matrix multiplication are proposed that are O(log p) times faster than the fastest word-level architecture, where p is the word length. The architecture presented in this paper is faster still than the two in [1] because it cuts the critical path in the dependence graph in half. We present the basic ideas behind the speedup, how to establish the dependence structure, and how to derive the final design. We also show that our design is time-optimal for our dependence structure and has a speedup of 50% or more over the designs presented in [1]. We are implementing the design on a Xilinx FPGA chip, where it shows a potential speedup over the Xilinx multiplier macro. Our approach can be used to map algorithms to hardware.
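The processing-element operation described in the abstract can be illustrated with a small software model. The sketch below is not the paper's architecture: it is a sequential Python model, restricted to unsigned operands, and the names `pe_add_bits`, `bit_level_dot`, and `bit_level_matmul` are hypothetical. It only shows that a cell adding three to six bits (one sum bit, one or two carry bits) is enough to build a matrix product without ever treating a word as a unit.

```python
def pe_add_bits(bits):
    """Hypothetical cell: add three to six bits.

    Returns (sum_bit, carries). carries[0] has weight 2 and, when four or
    more bits are added, carries[1] has weight 4.
    """
    if not 3 <= len(bits) <= 6 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected three to six bits, each 0 or 1")
    total = sum(bits)                      # 0..6, fits in three bits
    carries = [(total >> 1) & 1]
    if len(bits) >= 4:                     # totals of 4..6 need a second carry
        carries.append((total >> 2) & 1)
    return total & 1, carries


def bit_level_dot(a, b, p):
    """Unsigned dot product of p-bit words using only pe_add_bits.

    Assumes non-empty inputs and p >= 1 (an assumption of this sketch).
    """
    columns = {}                           # bit weight -> list of pending bits
    for x, y in zip(a, b):
        for k in range(p):
            for l in range(p):
                # Partial-product bit of weight 2**(k + l).
                columns.setdefault(k + l, []).append((x >> k) & (y >> l) & 1)
    result, w = 0, 0
    while w <= max(columns):
        col = columns.setdefault(w, [])
        while len(col) > 1:
            chunk, col[:] = col[:6], col[6:]
            chunk += [0] * (3 - len(chunk))    # pad to at least three bits
            s, carries = pe_add_bits(chunk)
            col.append(s)                      # sum bit stays in this column
            for j, c in enumerate(carries, start=1):
                columns.setdefault(w + j, []).append(c)
        result |= (col[0] if col else 0) << w
        w += 1
    return result


def bit_level_matmul(A, B, p):
    """Matrix product built from bit_level_dot (row of A times column of B)."""
    cols = list(zip(*B))
    return [[bit_level_dot(row, col, p) for col in cols] for row in A]


if __name__ == "__main__":
    print(pe_add_bits([1, 1, 1, 1, 1, 1]))
    print(bit_level_matmul([[1, 2], [3, 4]], [[5, 6], [7, 0]], 3))
```

The model processes columns one after another, so it says nothing about the timing or the dependence structure analyzed in the paper; in hardware, cells in different columns would operate concurrently.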
Pages: 2257 - 2264
Page count: 8
Related papers
50 in total
  • [1] Bit-level two's complement matrix multiplication
    Grover, RS
    Shang, WJ
    Li, Q
    INTEGRATION-THE VLSI JOURNAL, 2002, 33 (1-2) : 3 - 21
  • [2] BIT-LEVEL SYSTOLIC ARRAY CIRCUIT FOR MATRIX VECTOR MULTIPLICATION
    MCCANNY, JV
    MCWHIRTER, JG
    IEE PROCEEDINGS-G CIRCUITS DEVICES AND SYSTEMS, 1983, 130 (04) : 125 - 130
  • [3] AN IMPROVED BIT-LEVEL SYSTOLIC ARCHITECTURE FOR IIR FILTERING
    KNOWLES, SC
    MCWHIRTER, JG
    SYSTOLIC ARRAY PROCESSORS, 1989, : 205 - 214
  • [4] Bit-level architectures for Montgomery's multiplication
    Nibouche, O
    Bouridane, A
    Nibouche, M
    ICECS 2001: 8TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS, VOLS I-III, CONFERENCE PROCEEDINGS, 2001, : 273 - 276
  • [5] Bit-level parallel array algorithms of vector-vector and matrix-matrix multiplication
    Guo Li
    Wang Miao-Feng
    Qiu Tian
    Liu Lu
    Luo Feng
    2006 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS PROCEEDINGS, VOLS 1-4: VOL 1: SIGNAL PROCESSING, 2006, : 567 - +
  • [6] Image encryption scheme with bit-level scrambling and multiplication diffusion
    Li, Chun-Lai
    Zhou, Yang
    Li, Hong-Min
    Feng, Wei
    Du, Jian-Rong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (12) : 18479 - 18501
  • [7] Bit-Level Optimized Constant Multiplication Using Boolean Satisfiability
    Fiege, Nicolai
    Kumm, Martin
    Zipf, Peter
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (01) : 249 - 261
  • [8] Image encryption scheme with bit-level scrambling and multiplication diffusion
    Chun-Lai Li
    Yang Zhou
    Hong-Min Li
    Wei Feng
    Jian-Rong Du
    Multimedia Tools and Applications, 2021, 80 : 18479 - 18501
  • [9] Optimize Dataflow of DNN on Bit-Level Composable Architecture
    Gao, Hanyuan
    Gong, Lei
    Wang, Teng
    Computer Engineering and Applications, 60 (18) : 147 - 157
  • [10] A bit-level pipelined VLSI architecture for the running order algorithm
    Chen, CT
    Chen, LG
    Hsiao, JH
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1997, 45 (08) : 2140 - 2144