HISPE: High-Speed Configurable Floating-Point Multi-Precision Processing Element

Cited by: 0
Authors
Tejas, B. N. [1]
Bhatia, Rakshit [1]
Rao, Madhav [1]
Affiliations
[1] IIIT Bangalore, Bangalore, Karnataka, India
Keywords
Floating Point (FP); Processing Element (PE); TensorFloat-32 (TF32); BrainFloat-16 (BF16); High-Performance Computing (HPC); Multiply-Accumulate (MAC)
DOI
10.1109/ISQED60706.2024.10528733
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A floating-point processing element (PE) needs multiple precision modes so that it can handle numerical data with differing precision and performance requirements. High-precision floating-point operations produce more accurate results and cover a wider numerical range, whereas low-precision operations offer faster computation and lower power consumption. In this paper, we propose a configurable multi-precision PE that supports Half Precision, Single Precision, Double Precision, BrainFloat-16 (BF-16), and TensorFloat-32 (TF-32). The design is realized in GPDK 45 nm technology and operates at a clock frequency of 281.9 MHz. It was also implemented on a Xilinx ZCU104 FPGA evaluation board. Compared with previous state-of-the-art (SOTA) multi-precision PEs, the proposed design supports two additional floating-point data formats, namely BF-16 and TF-32. It achieves the best energy performance, at 2368.91 GFLOPS/W, and offers a 63% improvement in operating frequency with a comparable footprint and power metrics.
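To make the precision/range trade-off among the five formats named in the abstract concrete, the following minimal Python sketch tabulates their commonly published field widths and derives each format's total width, exponent bias, machine epsilon, and largest finite value. The `FORMATS` table and the `format_summary` helper are illustrative names, not taken from the paper, and the sketch says nothing about how the HISPE datapath is organized.

```python
# Illustrative sketch: commonly published (exponent bits, mantissa bits) per
# format, each with one additional sign bit. Not taken from the HISPE paper.
FORMATS = {
    "FP16": (5, 10),   # IEEE 754 half precision
    "BF16": (8, 7),    # BrainFloat-16
    "TF32": (8, 10),   # TensorFloat-32 (19 significant bits in a 32-bit container)
    "FP32": (8, 23),   # IEEE 754 single precision
    "FP64": (11, 52),  # IEEE 754 double precision
}


def format_summary(exp_bits, man_bits):
    """Derive basic properties of a binary floating-point format."""
    bias = 2 ** (exp_bits - 1) - 1
    return {
        "total_bits": 1 + exp_bits + man_bits,  # sign + exponent + mantissa
        "bias": bias,
        # Gap between 1.0 and the next representable value.
        "epsilon": 2.0 ** -man_bits,
        # Largest finite value: all-ones mantissa at the maximum normal exponent.
        "max_finite": (2.0 - 2.0 ** -man_bits) * 2.0 ** bias,
    }


if __name__ == "__main__":
    for name, (e, m) in FORMATS.items():
        s = format_summary(e, m)
        print(f"{name}: {s['total_bits']} bits, bias {s['bias']}, "
              f"eps {s['epsilon']:.3e}, max {s['max_finite']:.3e}")
```

BF16 and TF32 share FP32's 8-bit exponent, so they cover the same range while carrying fewer mantissa bits. This is the usual reason they are grouped with the low-precision, high-throughput modes.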
Pages: 8