New Flexible Multiple-Precision Multiply-Accumulate Unit for Deep Neural Network Training and Inference

Cited by: 25
Authors
Zhang, Hao [1 ]
Chen, Dongdong [2 ]
Ko, Seok-Bum [1 ]
Affiliations
[1] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK S7N 5A2, Canada
[2] Intel Corp, San Jose, CA 95134 USA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Neural networks; Standards; Deep learning; Training; Hardware; Adders; Pipelines; Multiply-accumulate unit; multiple-precision arithmetic; flexible precision arithmetic; deep neural network computing; computer arithmetic; ADD;
DOI
10.1109/TC.2019.2936192
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline Code
0812;
Abstract
In this paper, a new flexible multiple-precision multiply-accumulate (MAC) unit is proposed for deep neural network training and inference. The proposed MAC unit supports both fixed-point and floating-point operations. In the floating-point format, the proposed unit supports one 16-bit MAC operation or the sum of two 8-bit multiplications plus a 16-bit addend. To make the proposed MAC unit more versatile, the bit-widths of the exponent and mantissa can be flexibly exchanged. By setting the exponent bit-width to zero, the proposed MAC unit also supports fixed-point operations. In the fixed-point format, the proposed unit supports one 16-bit MAC or the sum of two 8-bit multiplications plus a 16-bit addend. Moreover, the proposed unit can be further divided to support the sum of four 4-bit multiplications plus a 16-bit addend. At the lowest precision, the proposed MAC unit supports the accumulation of eight 1-bit logic AND operations to enable binary neural networks. Compared to a standard 16-bit half-precision MAC unit, the proposed MAC unit provides more flexibility with only 21.8 percent area overhead. Compared to a standard 32-bit single-precision MAC unit, the proposed MAC unit requires much less hardware while still providing an 8-bit exponent in the numerical format to maintain a large dynamic range for deep learning computing.
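The abstract describes a single sum-of-products datapath whose lane count grows as operand precision shrinks and whose exponent/mantissa split is configurable. Below is a minimal behavioral sketch of that partitioning in Python. It is a software model under assumed conventions (generic biased exponent with an implicit leading one, two's-complement fixed point); the names decode_flex_float and flexible_mac and the 1-4-3 fp8 split in the example are illustrative assumptions, not the paper's hardware interface.

# Behavioral sketch only (not the authors' RTL): models the operand partitioning
# of a flexible multiple-precision MAC. All names and format splits here are
# illustrative assumptions.

def decode_flex_float(bits: int, exp_width: int, man_width: int) -> float:
    """Decode a (1 + exp_width + man_width)-bit word.

    With exp_width > 0 the word is sign/exponent/mantissa with an implicit
    leading one; with exp_width == 0 it is read as a two's-complement integer,
    mirroring the exponent-width-zero fixed-point mode described in the abstract.
    """
    total = 1 + exp_width + man_width
    if exp_width == 0:
        return float(bits - (1 << total)) if bits >> (total - 1) else float(bits)
    sign = -1.0 if bits >> (total - 1) else 1.0
    exp = (bits >> man_width) & ((1 << exp_width) - 1)
    man = bits & ((1 << man_width) - 1)
    bias = (1 << (exp_width - 1)) - 1
    if exp == 0:  # subnormal: no implicit leading one
        return sign * (man / (1 << man_width)) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / (1 << man_width)) * 2.0 ** (exp - bias)

def flexible_mac(a_words, b_words, addend, exp_width, man_width):
    """Return addend + sum(a[i] * b[i]) over all lanes.

    Lane count follows operand width: 1 lane at 16 bits, 2 lanes at 8 bits,
    4 lanes at 4 bits, and 8 lanes at 1 bit (logic AND accumulation for
    binary neural networks).
    """
    if exp_width == 0 and man_width == 0:  # 1-bit binary mode
        return addend + sum(a & b for a, b in zip(a_words, b_words))
    return addend + sum(decode_flex_float(a, exp_width, man_width) *
                        decode_flex_float(b, exp_width, man_width)
                        for a, b in zip(a_words, b_words))

# Two fp8 products (assumed 1-4-3 split) plus an addend of 0.5:
# 1.0 * 1.5 + 2.0 * 3.0 + 0.5 = 8.0
print(flexible_mac([0x38, 0x40], [0x3C, 0x44], 0.5, exp_width=4, man_width=3))
# Eight 1-bit AND lanes accumulated onto the same addend (binary mode):
print(flexible_mac([1, 0, 1, 1, 0, 1, 1, 1], [1, 1, 0, 1, 0, 1, 0, 1],
                   0, exp_width=0, man_width=0))

Setting exp_width to zero collapses the same call into a fixed-point MAC, and narrowing the operand width raises the lane count; this is the flexibility that the abstract quantifies against the half-precision and single-precision baselines.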
Pages: 26-38 (13 pages)
Related Papers (50 in total)
• [41] Mao, Yunlong; Hong, Wenbo; Zhu, Boyu; Zhu, Zhifei; Zhang, Yuan; Zhong, Sheng. Secure Deep Neural Network Models Publishing Against Membership Inference Attacks Via Training Task Parallelism. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11): 3079-3091.
• [42] Fu, Yonggan; Yu, Qixuan; Li, Meng; Chandra, Vikas; Lin, Yingyan. Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inference. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139.
• [43] Han, Yuecai; Li, Nan. A new deep neural network algorithm for multiple stopping with applications in options pricing. COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 2023, 117.
• [44] Teng, Mengfan; Li, Siwei; Yang, Jie; Chen, Jiarui; Fan, Chunying; Ding, Yu. A new hybrid deep neural network for multiple sites PM2.5 forecasting. JOURNAL OF CLEANER PRODUCTION, 2024, 473.
• [45] Baek, Ki Yeol; Kim, In Su; Jang, Jae Seok; Jung, Soon Ki. A Prototype of a Self-Motion Training System based on Deep Convolutional Neural Network and Multiple FAMirror. PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018), 2018: 296-301.
• [46] Zhou, Min; Chen, Minghua; Low, Steven H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems with Flexible Topology. 2023 IEEE POWER & ENERGY SOCIETY GENERAL MEETING, PESGM, 2023.
• [47] Zhou, Min; Chen, Minghua; Low, Steven H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems With Flexible Topology. IEEE TRANSACTIONS ON POWER SYSTEMS, 2023, 38 (01): 964-967.
• [48] Feng, Jiayun; Wang, Yu; Hu, Xianwu; Wen, Gan; Wang, Zeming; Lin, Yukai; Wu, Danqing; Ma, Zizhao; Zhao, Liang; Lu, Zhichao; Xie, Yufeng. A Hybrid RRAM-SRAM Computing-In-Memory Architecture for Deep Neural Network Inference-Training Edge Acceleration. 2021 SILICON NANOELECTRONICS WORKSHOP (SNW), 2021: 65-66.
• [49] Zhao, Wenzhe; Yang, Guoming; Xia, Tian; Chen, Fei; Zheng, Nanning; Ren, Pengju. HIPU: A Hybrid Intelligent Processing Unit With Fine-Grained ISA for Real-Time Deep Neural Network Inference Applications. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (12): 1980-1993.
• [50] Luo, Yandong; Yu, Shimeng. Accelerating Deep Neural Network In-Situ Training With Non-Volatile and Volatile Memory Based Hybrid Precision Synapses. IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (08): 1113-1127.