New Flexible Multiple-Precision Multiply-Accumulate Unit for Deep Neural Network Training and Inference

Cited by: 25
Authors
Zhang, Hao [1 ]
Chen, Dongdong [2 ]
Ko, Seok-Bum [1 ]
Affiliations
[1] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK S7N 5A2, Canada
[2] Intel Corp, San Jose, CA 95134 USA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Neural networks; Standards; Deep learning; Training; Hardware; Adders; Pipelines; Multiply-accumulate unit; multiple-precision arithmetic; flexible precision arithmetic; deep neural network computing; computer arithmetic; ADD;
DOI
10.1109/TC.2019.2936192
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline Code
0812;
Abstract
In this paper, a new flexible multiple-precision multiply-accumulate (MAC) unit is proposed for deep neural network training and inference. The proposed MAC unit supports both fixed-point and floating-point operations. In the floating-point format, the proposed unit supports one 16-bit MAC operation or the sum of two 8-bit multiplications plus a 16-bit addend. To make the proposed MAC unit more versatile, the bit-widths of the exponent and mantissa can be flexibly exchanged. By setting the exponent bit-width to zero, the proposed MAC unit also supports fixed-point operations. In the fixed-point format, the proposed unit supports one 16-bit MAC or the sum of two 8-bit multiplications plus a 16-bit addend. Moreover, the proposed unit can be further divided to support the sum of four 4-bit multiplications plus a 16-bit addend. At the lowest precision, the proposed MAC unit supports the accumulation of eight 1-bit logic AND operations to enable binary neural networks. Compared to a standard 16-bit half-precision MAC unit, the proposed MAC unit provides more flexibility with only 21.8 percent area overhead. Compared to a standard 32-bit single-precision MAC unit, the proposed MAC unit requires much less hardware while still providing an 8-bit exponent in the numerical format to maintain a large dynamic range for deep learning computing.
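The abstract describes a single sum-of-products datapath whose lane count grows as operand precision shrinks and whose exponent/mantissa split is configurable. Below is a minimal behavioral sketch of that partitioning in Python. It is a software model under assumed conventions (generic biased exponent with an implicit leading one, two's-complement fixed point); the names decode_flex_float and flexible_mac and the 1-4-3 fp8 split in the example are illustrative assumptions, not the paper's hardware interface.

# Behavioral sketch only (not the authors' RTL): models the operand partitioning
# of a flexible multiple-precision MAC. All names and format splits here are
# illustrative assumptions.

def decode_flex_float(bits: int, exp_width: int, man_width: int) -> float:
    """Decode a (1 + exp_width + man_width)-bit word.

    With exp_width > 0 the word is sign/exponent/mantissa with an implicit
    leading one; with exp_width == 0 it is read as a two's-complement integer,
    mirroring the exponent-width-zero fixed-point mode described in the abstract.
    """
    total = 1 + exp_width + man_width
    if exp_width == 0:
        return float(bits - (1 << total)) if bits >> (total - 1) else float(bits)
    sign = -1.0 if bits >> (total - 1) else 1.0
    exp = (bits >> man_width) & ((1 << exp_width) - 1)
    man = bits & ((1 << man_width) - 1)
    bias = (1 << (exp_width - 1)) - 1
    if exp == 0:  # subnormal: no implicit leading one
        return sign * (man / (1 << man_width)) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / (1 << man_width)) * 2.0 ** (exp - bias)

def flexible_mac(a_words, b_words, addend, exp_width, man_width):
    """Return addend + sum(a[i] * b[i]) over all lanes.

    Lane count follows operand width: 1 lane at 16 bits, 2 lanes at 8 bits,
    4 lanes at 4 bits, and 8 lanes at 1 bit (logic AND accumulation for
    binary neural networks).
    """
    if exp_width == 0 and man_width == 0:  # 1-bit binary mode
        return addend + sum(a & b for a, b in zip(a_words, b_words))
    return addend + sum(decode_flex_float(a, exp_width, man_width) *
                        decode_flex_float(b, exp_width, man_width)
                        for a, b in zip(a_words, b_words))

# Two fp8 products (assumed 1-4-3 split) plus an addend of 0.5:
# 1.0 * 1.5 + 2.0 * 3.0 + 0.5 = 8.0
print(flexible_mac([0x38, 0x40], [0x3C, 0x44], 0.5, exp_width=4, man_width=3))
# Eight 1-bit AND lanes accumulated onto the same addend (binary mode):
print(flexible_mac([1, 0, 1, 1, 0, 1, 1, 1], [1, 1, 0, 1, 0, 1, 0, 1],
                   0, exp_width=0, man_width=0))

Setting exp_width to zero collapses the same call into a fixed-point MAC, and narrowing the operand width raises the lane count; this is the flexibility that the abstract quantifies against the half-precision and single-precision baselines.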
Pages: 26-38 (13 pages)
Related Papers (50 in total)
• [41] Mao, Yunlong; Hong, Wenbo; Zhu, Boyu; Zhu, Zhifei; Zhang, Yuan; Zhong, Sheng. Secure Deep Neural Network Models Publishing Against Membership Inference Attacks Via Training Task Parallelism. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11): 3079-3091.
• [42] Fu, Yonggan; Yu, Qixuan; Li, Meng; Chandra, Vikas; Lin, Yingyan. Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inference. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139.
• [43] Han, Yuecai; Li, Nan. A new deep neural network algorithm for multiple stopping with applications in options pricing. COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 2023, 117.
• [44] Teng, Mengfan; Li, Siwei; Yang, Jie; Chen, Jiarui; Fan, Chunying; Ding, Yu. A new hybrid deep neural network for multiple sites PM2.5 forecasting. JOURNAL OF CLEANER PRODUCTION, 2024, 473.
• [45] Baek, Ki Yeol; Kim, In Su; Jang, Jae Seok; Jung, Soon Ki. A Prototype of a Self-Motion Training System based on Deep Convolutional Neural Network and Multiple FAMirror. PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018), 2018: 296-301.
• [46] Zhou, Min; Chen, Minghua; Low, Steven H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems with Flexible Topology. 2023 IEEE POWER & ENERGY SOCIETY GENERAL MEETING, PESGM, 2023.
• [47] Zhou, Min; Chen, Minghua; Low, Steven H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems With Flexible Topology. IEEE TRANSACTIONS ON POWER SYSTEMS, 2023, 38 (01): 964-967.
• [48] Feng, Jiayun; Wang, Yu; Hu, Xianwu; Wen, Gan; Wang, Zeming; Lin, Yukai; Wu, Danqing; Ma, Zizhao; Zhao, Liang; Lu, Zhichao; Xie, Yufeng. A Hybrid RRAM-SRAM Computing-In-Memory Architecture for Deep Neural Network Inference-Training Edge Acceleration. 2021 SILICON NANOELECTRONICS WORKSHOP (SNW), 2021: 65-66.
• [49] Zhao, Wenzhe; Yang, Guoming; Xia, Tian; Chen, Fei; Zheng, Nanning; Ren, Pengju. HIPU: A Hybrid Intelligent Processing Unit With Fine-Grained ISA for Real-Time Deep Neural Network Inference Applications. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (12): 1980-1993.
• [50] Luo, Yandong; Yu, Shimeng. Accelerating Deep Neural Network In-Situ Training With Non-Volatile and Volatile Memory Based Hybrid Precision Synapses. IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (08): 1113-1127.