Radix-64 Floating-Point Division and Square Root: Iterative and Pipelined Units

Cited by: 2
Authors
Bruguera, Javier D. [1]
Affiliations
[1] Arm Ltd, Cambridge CB1 9NJ, England
Keywords
Table lookup; Mathematical models; Low latency communication; Iterative algorithms; Program processors; Hardware; Timing; Digit-recurrence algorithms; floating-point division and square root; iterative methods
DOI
10.1109/TC.2023.3280136
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Digit-recurrence algorithms are widely used in current microprocessors to compute floating-point division and square root. These iterative algorithms offer a good trade-off between performance, area and power. Traditionally, commercial processors have iterative division and square root units in which the iteration logic is reused over several cycles. The main drawbacks of these iterative units are long latency and low throughput, caused by reusing part of the logic over several cycles, and hardware complexity, caused by separate logic for division and square root. We present a radix-64 floating-point division and square root algorithm with a common iteration for division and square root in which, to keep the implementation affordable, each radix-64 iteration is made of two simpler radix-8 iterations. The radix-64 algorithm enables low-latency operation, and the common division and square root radix-64 iteration yields some area reduction. The algorithm is mapped onto two different microarchitectures: a low-latency, low-area iterative unit and a low-latency, high-throughput pipelined unit. In both units, speculation between consecutive radix-8 iterations is used to improve timing.
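The idea of building one radix-64 iteration from two radix-8 iterations can be illustrated with a small software model of digit-recurrence division. This is only a sketch under stated assumptions, not the unit described in the paper: it uses exact rational arithmetic instead of a truncated carry-save residual, it picks each digit by exact rounding, it assumes a signed radix-8 digit set {-4, ..., 4}, and it covers division only. The function names `radix8_step`, `radix64_step` and `divide` are hypothetical.

```python
from fractions import Fraction


def radix8_step(w, d):
    """One radix-8 step: w_next = 8*w - q*d, with q chosen by rounding.

    Assumption: signed digit set {-4..4}; this holds while |w| <= d/2.
    """
    shifted = 8 * w
    q = round(shifted / d)  # exact selection; hardware would use an estimate
    assert -4 <= q <= 4, "residual left the assumed convergence range"
    return q, shifted - q * d


def radix64_step(w, d):
    """One radix-64 step modelled as two consecutive radix-8 steps."""
    q_hi, w = radix8_step(w, d)
    q_lo, w = radix8_step(w, d)
    return 8 * q_hi + q_lo, w  # combined radix-64 digit and new residual


def divide(x, d, iterations):
    """Approximate x/d for x, d in [1, 2) using radix-64 iterations.

    Returns (quotient, final_residual) as exact Fractions.
    """
    x, d = Fraction(x), Fraction(d)
    w = x / 4  # pre-scaling so that |w| <= d/2 before the first step
    acc = Fraction(0)
    for j in range(1, iterations + 1):
        digit, w = radix64_step(w, d)
        acc += Fraction(digit, 64 ** j)
    return 4 * acc, w  # undo the pre-scaling


if __name__ == "__main__":
    q, w = divide(Fraction(3, 2), Fraction(5, 4), iterations=9)
    print(float(q))
```

In this model each radix-64 iteration retires 6 quotient bits, so after n iterations the quotient error is bounded by 2 * 64^-n. The `iterations` value in the usage example is an arbitrary choice for illustration.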
Pages: 2990-3001
Number of pages: 12
Related papers
50 in total
  • [1] Radix-64 Floating-Point Divider
    Bruguera, Javier D.
    2018 IEEE 25TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH), 2018, : 84 - 91
  • [2] Low-Latency and High-Bandwidth Pipelined Radix-64 Division and Square Root Unit
    Bruguera, Javier D.
    2022 IEEE 29TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH 2022), 2022, : 10 - 17
  • [3] Scalable pipeline insertion in floating-point division and square root units
    Ortiz, I., IEEE Circuits and Systems Society; Hiroshima University (Institute of Electrical and Electronics Engineers Inc.):
  • [4] Scalable pipeline insertion in floating-point division and square root units
    Ortiz, I
    Jimenez, M
    2004 47TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, CONFERENCE PROCEEDINGS, 2004, : 225 - 228
  • [5] SIMPLIFIED FLOATING-POINT DIVISION AND SQUARE ROOT
    Viitanen, Timo
    Jaaskelainen, Pekka
    Esko, Otto
    Takala, Jarmo
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 2707 - 2711
  • [6] Low Latency Floating-Point Division and Square Root Unit
    Bruguera, Javier D.
    IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (02) : 274 - 287
  • [7] A novel implementation of radix-4 floating-point division/square-root using comparison multiples
    Nikmehr, H.
    Phillips, B.
    Lim, C. C.
    COMPUTERS & ELECTRICAL ENGINEERING, 2010, 36 (05) : 850 - 863
  • [8] Tradeoffs of designing floating-point division and square root on virtex FPGAs
    Wang, XJ
    Nelson, BE
    FCCM 2003: 11TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS, 2003, : 195 - 203
  • [9] A VLSI MODULE FOR IEEE FLOATING-POINT MULTIPLICATION DIVISION SQUARE ROOT
    LU, PY
    DAWALLU, K
    PROCEEDINGS - IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN : VLSI IN COMPUTERS & PROCESSORS, 1989, : 366 - 368
  • [10] A Low-Cost High Radix Floating-Point Square-Root Circuit
    Yang, Yuheng
    Yuan, Qing
    Liu, Jian
    ELECTRONICS, 2021, 10 (16)