A Reconfigurable Processing Element for Multiple-Precision Floating/Fixed-Point HPC

Cited by: 1
Authors
Li, Boyu [1 ]
Li, Kai [2 ]
Zhou, Jiajun [1 ]
Ren, Yuan [1 ]
Mao, Wei [2 ]
Yu, Hao [2 ]
Wong, Ngai [1 ]
Affiliations
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[2] Southern Univ Sci & Technol, Sch Microelect, Shenzhen 518055, Peoples R China
Keywords
Multiple-precision; floating-point; fixed-point; PE; MAC; HPC; UNIT; ARCHITECTURE; ADD;
DOI
10.1109/TCSII.2023.3322259
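Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808; 0809;
Abstract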
High-performance computing (HPC) can facilitate deep neural network (DNN) training and inference. Previous works have proposed multiple-precision floating- and fixed-point designs, but most can only handle either one independently. This brief proposes a novel reconfigurable processing element (PE) supporting both energy-efficient floating-point and fixed-point multiply-accumulate (MAC) operations. This PE can support $9\times$ BFloat16 (BF16), $4\times$ half-precision (FP16), $4\times$ TensorFloat-32 (TF32) and $1\times$ single-precision (FP32) MAC operation with 100% multiplication hardware utilization in one clock cycle. Besides, it can also support $72\times$ INT2, $36\times$ INT4 and $9\times$ INT8 dot product plus one 32-bit addend. The design is realized in a 28 nm process at a 1.471 GHz slow-corner clock frequency. Compared with state-of-the-art (SOTA) multiple-precision PEs, the proposed work exhibits the best energy efficiency of 834.35 GFLOPS/W and 1761.41 GFLOPS/W at TF32 and BF16 with at least $10\times$ and $4\times$ improvement, respectively, for deep learning training. Meanwhile, this design supports energy-efficient fixed-point computing with a small hardware overhead for deep learning inference.
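The fixed-point mode described in the abstract (a dot product over 72, 36 or 9 lanes plus one 32-bit addend) can be illustrated with a small functional model. This is only a sketch and not the authors' hardware design: the function name `int_dot_plus_addend`, the `LANES` table layout, the choice of signed two's-complement inputs, and the absence of any overflow or saturation handling on the result are all assumptions made for illustration; only the lane counts come from the abstract.

```python
# Lane counts per fixed-point mode, taken from the abstract:
# 72 x INT2, 36 x INT4, 9 x INT8.
LANES = {2: 72, 4: 36, 8: 9}


def int_dot_plus_addend(a, b, addend, bits):
    """Functional model: sum(a[i] * b[i]) + addend.

    Assumptions (not stated in the abstract): inputs are signed
    two's-complement values of width `bits`, the addend is a signed
    32-bit value, and the result is returned as an unbounded Python int
    (no wrap-around or saturation is modeled).
    """
    if bits not in LANES:
        raise ValueError(f"unsupported width: INT{bits}")
    lanes = LANES[bits]
    if len(a) != lanes or len(b) != lanes:
        raise ValueError(f"INT{bits} mode expects {lanes} element pairs")

    # Range of a signed value with the given bit width.
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for x in (*a, *b):
        if not lo <= x <= hi:
            raise ValueError(f"{x} does not fit in signed INT{bits}")
    if not -(1 << 31) <= addend < (1 << 31):
        raise ValueError("addend does not fit in 32 bits")

    return sum(x * y for x, y in zip(a, b)) + addend


if __name__ == "__main__":
    # Nine INT8 pairs plus an addend of 10.
    print(int_dot_plus_addend([1] * 9, [2] * 9, 10, bits=8))
```

The model only captures the arithmetic result of one operation; it says nothing about how the PE shares multiplier hardware between modes or about timing.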
Pages: 1401-1405
Number of pages: 5
Related papers
50 in total
  • [21] VFloat: A Variable Precision Fixed- and Floating-Point Library for Reconfigurable Hardware
    Wang, Xiaojun
    Leeser, Miriam
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2010, 3 (03)
  • [22] A new architecture for multiple-precision floating-point multiply-add fused unit design
    Huang, Libo
    Shen, Li
    Dai, Kui
    Wang, Zhiying
    18TH IEEE SYMPOSIUM ON COMPUTER ARITHMETIC, PROCEEDINGS, 2007, : 69 - +
  • [23] STOCHASTIC MODELING FOR FLOATING-POINT TO FIXED-POINT CONVERSION
    Banciu, Andrei
    Casseau, Emmanuel
    Menard, Daniel
    Michel, Thierry
    2011 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2011, : 180 - 185
  • [24] Floating-point DSP extends fixed-point architecture
    Myrvaagnes, R
    ELECTRONIC PRODUCTS MAGAZINE, 1998, 41 (04): : 26 - 26
  • [25] An automated floating-point to fixed-point conversion methodology
    Shi, CC
    Brodersen, RW
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 529 - 532
  • [26] Computing floating-point logarithms with fixed-point operations
    Le Maire, Julien
    Brunie, Nicolas
    de Dinechin, Florent
    Muller, Jean-Michel
    2016 IEEE 23nd Symposium on Computer Arithmetic (ARITH), 2016, : 156 - 163
  • [27] An introduction to fixed-point signal processing
    Linebarger, DA
    Bryan, TA
    IEEE 11TH DIGITAL SIGNAL PROCESSING WORKSHOP & 2ND IEEE SIGNAL PROCESSING EDUCATION WORKSHOP, 2004, : 19 - 23
  • [28] A methodology for evaluating the precision of fixed-point systems
    Menard, D
    Sentieys, O
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3152 - 3155
  • [29] Energy-efficiency of floating-point and fixed-point SIMD cores for MIMO processing systems
    Guenther, D.
    Bytyn, A.
    Leupers, R.
    Ascheid, G.
    2014 INTERNATIONAL SYMPOSIUM ON SYSTEM-ON-CHIP (SOC), 2014,
  • [30] A Dynamically Reconfigurable Platform for Fixed-Point FIR Filters
    Llamocca, Daniel
    Pattichis, Marios
    Vera, G. Alonzo
    2009 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS, 2009, : 332 - +