Radix-64 Floating-Point Division and Square Root: Iterative and Pipelined Units

Cited by: 2
Author
Bruguera, Javier D. [1]
Affiliation
[1] Arm Ltd, Cambridge CB1 9NJ, England
Keywords
Table lookup; Mathematical models; Low latency communication; Iterative algorithms; Program processors; Hardware; Timing; Digit-recurrence algorithms; floating-point division and square root; iterative methods;
DOI
10.1109/TC.2023.3280136
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Digit-recurrence algorithms are widely used in current microprocessors to compute floating-point division and square root. These iterative algorithms offer a good trade-off among performance, area, and power. Traditionally, commercial processors have used iterative division and square root units in which the iteration logic is reused over several cycles. The main drawbacks of these iterative units are long latency and low throughput, caused by reusing part of the logic over several cycles, and hardware complexity, caused by separate logic for division and square root. We present a radix-64 floating-point division and square root algorithm with a common iteration for both operations in which, to keep the implementation affordable, each radix-64 iteration is built from two simpler radix-8 iterations. The radix-64 algorithm enables low-latency operation, and the common division and square root iteration yields some area reduction. The algorithm is mapped onto two microarchitectures: a low-latency, low-area iterative unit and a low-latency, high-throughput pipelined unit. In both units, speculation between consecutive radix-8 iterations is used to improve timing.
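As a rough illustration of how one radix-64 iteration can be composed of two radix-8 iterations, the following Python sketch performs digit-recurrence division on non-negative integers. It is a simplification and not the unit described in the abstract: digits are selected by exact integer division rather than by the estimates a hardware unit would use, the digit set is the non-redundant 0..7 rather than a signed-digit set, and there is no speculation, floating-point handling, or square root. The function names `radix8_step`, `radix64_step`, and `divide` are hypothetical.

```python
def radix8_step(w, d):
    """One radix-8 step on remainder w, assuming 0 <= w < d.

    Returns (digit, next_remainder) with digit in 0..7.
    Assumption: exact digit selection; hardware would use an estimate.
    """
    shifted = 8 * w              # shift the remainder left by one radix-8 digit
    digit = shifted // d         # quotient digit retired in this step (3 bits)
    return digit, shifted - digit * d


def radix64_step(w, d):
    """One radix-64 step built from two consecutive radix-8 steps."""
    q_hi, w = radix8_step(w, d)  # upper 3 bits of the radix-64 digit
    q_lo, w = radix8_step(w, d)  # lower 3 bits of the radix-64 digit
    return 8 * q_hi + q_lo, w    # combined digit in 0..63


def divide(x, d, iterations):
    """Digit-recurrence division for integers with 0 <= x < d.

    Returns (q, w) where q = floor(x * 64**iterations / d)
    and w is the final remainder.
    """
    w, q = x, 0
    for _ in range(iterations):
        digit, w = radix64_step(w, d)
        q = 64 * q + digit       # append 6 quotient bits per iteration
    return q, w
```

For example, `divide(3, 7, 4)` retires 24 quotient bits in four radix-64 iterations and returns the same quotient and remainder as `divmod(3 * 64**4, 7)`.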
Pages: 2990-3001
Number of pages: 12