HISPE: High-Speed Configurable Floating-Point Multi-Precision Processing Element

Cited by: 0

Authors:

Tejas, B. N. [1]

Bhatia, Rakshit [1]

Rao, Madhav [1]

Affiliations:

[1] IIIT Bangalore, Bangalore, Karnataka, India

Source:

2024 25TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, ISQED 2024 | 2024

Keywords:

Floating Point (FP); Processing Element (PE); TensorFloat-32 (TF32); BrainFloat-16 (BF16); High-Performance Computing (HPC); Multiply-Accumulate (MAC);

DOI:

10.1109/ISQED60706.2024.10528733

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

A floating-point processing element (PE) needs multiple precision modes because they give it the flexibility to handle numerical data with varying precision and performance requirements. High-precision floating-point operations produce more accurate results and allow a wider range of numerical representation. Conversely, low-precision operations offer faster computation and lower power consumption. In this paper, we propose a configurable multi-precision PE that supports Half Precision, Single Precision, Double Precision, BrainFloat-16 (BF16), and TensorFloat-32 (TF32). The design is realized in GPDK 45 nm technology and operates at a clock frequency of 281.9 MHz. It was also implemented on a Xilinx ZCU104 FPGA evaluation board. Compared with previous state-of-the-art (SOTA) multi-precision PEs, the proposed design supports two additional floating-point data formats, namely BF16 and TF32. It achieves the best energy efficiency, at 2368.91 GFLOPS/W, and offers a 63% improvement in operating frequency with comparable footprint and power metrics.
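
For readers unfamiliar with the five formats named in the abstract, the sketch below lists their sign/exponent/mantissa bit widths and decodes a raw bit pattern under each one. The widths are the commonly published definitions (IEEE 754 for half, single, and double precision; the usual 1/8/7 and 1/8/10 layouts for BF16 and TF32) and are not taken from the paper. The names `FPFormat`, `FORMATS`, and `decode` are hypothetical helpers introduced here for illustration and do not describe the proposed PE's datapath.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FPFormat:
    """Bit layout of a binary floating-point format (1 sign bit assumed)."""
    name: str
    exp_bits: int
    man_bits: int

    @property
    def total_bits(self) -> int:
        return 1 + self.exp_bits + self.man_bits

    @property
    def bias(self) -> int:
        return (1 << (self.exp_bits - 1)) - 1


# Assumed layouts: commonly published definitions, not values from the paper.
FORMATS = {
    "FP16": FPFormat("FP16", 5, 10),   # half precision
    "BF16": FPFormat("BF16", 8, 7),    # BrainFloat-16
    "TF32": FPFormat("TF32", 8, 10),   # TensorFloat-32 (19 significant bits)
    "FP32": FPFormat("FP32", 8, 23),   # single precision
    "FP64": FPFormat("FP64", 11, 52),  # double precision
}


def decode(bits: int, fmt: FPFormat) -> float:
    """Interpret the low fmt.total_bits of `bits` as a value in format `fmt`."""
    sign = (bits >> (fmt.exp_bits + fmt.man_bits)) & 1
    exp = (bits >> fmt.man_bits) & ((1 << fmt.exp_bits) - 1)
    man = bits & ((1 << fmt.man_bits) - 1)
    s = -1.0 if sign else 1.0
    if exp == (1 << fmt.exp_bits) - 1:          # all-ones exponent: inf or NaN
        return s * float("inf") if man == 0 else float("nan")
    if exp == 0:                                # zero or subnormal
        return s * man * 2.0 ** (1 - fmt.bias - fmt.man_bits)
    # Normal number: implicit leading 1 in the significand.
    return s * (1 + man / (1 << fmt.man_bits)) * 2.0 ** (exp - fmt.bias)
```

Under these assumed layouts, BF16 and TF32 share the 8-bit exponent of single precision and so cover the same dynamic range, while TF32 carries the same 10-bit mantissa as half precision.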


Pages: 8

50 items in total

[1] A multi-precision floating-point adder
Ozbilen, Metin Mete
Gok, Mustafa
PRIME: 2008 PHD RESEARCH IN MICROELECTRONICS AND ELECTRONICS, PROCEEDINGS, 2008, : 117 - 120
[2] Multi-precision binary multiplier architecture for multi-precision floating-point multiplication
Tomar, Geetam Singh
George, Marcus Llyode
Tomar, Abhineet Singh
IET CIRCUITS DEVICES & SYSTEMS, 2021, 15 (05) : 455 - 464
[3] Implementation of multi-precision floating point divider for high speed signal processing applications
C. R. S. Hanuman
J. Kamala
A. R. Aruna
The Journal of Supercomputing, 2019, 75 : 6038 - 6054
[4] Implementation of multi-precision floating point divider for high speed signal processing applications
Hanuman, C. R. S.
Kamala, J.
Aruna, A. R.
JOURNAL OF SUPERCOMPUTING, 2019, 75 (09) : 6038 - 6054
[5] A Vector Systolic Accelerator for Multi-Precision Floating-Point High-Performance Computing
Li, Kai
Zhou, Junzhuo
Li, Boyu
Yang, Shuxing
Huang, Sixiao
Luo, Shaobo
Mao, Wei
Yu, Hao
2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA, 2022, : 226 - 229
[6] A Vector Systolic Accelerator for Multi-Precision Floating-Point High-Performance Computing
Li, Kai
Mao, Wei
Zhou, Junzhuo
Li, Boyu
Yang, Zhengke
Yang, Shuxing
Du, Laimin
Huang, Sixiao
Yu, Hao
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (10) : 4123 - 4127
[7] A Configurable Floating-Point Multiple-Precision Processing Element for HPC and AI Converged Computing
Mao, Wei
Li, Kai
Cheng, Quan
Dai, Liuyao
Li, Boyu
Xie, Xinang
Li, He
Lin, Longyang
Yu, Hao
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2022, 30 (02) : 213 - 226
[8] A generator of high-speed floating-point modules
Leyva, G
Caffarena, G
Carreras, C
Nieto-Taladriz, O
12TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS, 2004, : 306 - 307
[9] FPC: A High-Speed Compressor for Double-Precision Floating-Point Data
Burtscher, Martin
Ratanaworabhan, Paruj
IEEE TRANSACTIONS ON COMPUTERS, 2009, 58 (01) : 18 - 31
[10] FPGA implementation of the high-speed floating-point operation
Ji, XS
Wang, SR
ICEMI 2005: Conference Proceedings of the Seventh International Conference on Electronic Measurement & Instruments, Vol 3, 2005, : 626 - 629
