UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision

Cited by: 212
Authors
Lee, Jinmook [1 ]
Kim, Changhyeon [1 ]
Kang, Sanghoon [1 ]
Shin, Dongjoo [1 ]
Kim, Sangyeob [1 ]
Yoo, Hoi-Jun [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South Korea
Funding
National Research Foundation of Singapore
Keywords
Bit serial; deep learning; deep learning ASIC; deep learning hardware; deep neural network (DNN); mobile deep learning;
DOI
10.1109/JSSC.2018.2865489
CLC classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline codes
0808; 0809
Abstract
An energy-efficient deep neural network (DNN) accelerator, the unified neural processing unit (UNPU), is proposed for mobile deep learning applications. The UNPU supports both convolutional layers (CLs) and recurrent or fully connected layers (FCLs), covering the versatile workload combinations needed to accelerate various mobile deep learning applications. In addition, the UNPU is the first DNN accelerator ASIC to support fully variable weight bit precision from 1 to 16 bit, which lets it operate at the accuracy-energy optimal point. Moreover, the lookup table (LUT)-based bit-serial processing element (LBPE) in the UNPU reduces energy consumption by 23.1%, 27.2%, 41%, and 53.6% for 16-, 8-, 4-, and 1-bit weight precision, respectively, compared to a conventional fixed-point multiply-and-accumulate (MAC) array. Beyond the energy-efficiency improvement, the unified DNN core architecture of the UNPU improves peak performance for CLs by 1.15x over the previous work, allowing the UNPU to run at a lower voltage and frequency for a given DNN and further increase energy efficiency. The UNPU is implemented in 65-nm CMOS technology and occupies a 4 x 4 mm² die. It operates from a 0.63- to 1.1-V supply voltage with a maximum frequency of 200 MHz, achieving a peak performance of 345.6 GOPS at 16-bit weight precision and 7372 GOPS at 1-bit weight precision. The wide operating range of the UNPU yields a power efficiency of 3.08 TOPS/W at 16-bit weight precision and 50.6 TOPS/W at 1-bit weight precision. The functionality of the UNPU is successfully demonstrated on a verification system running an ImageNet deep CNN (VGG-16).
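The abstract's key idea is bit-serial weight processing: a dot product with n-bit weights is computed one weight bit-plane per step, so the same hardware serves any precision from 1 to 16 bit. The sketch below is a minimal software illustration of that principle only, assuming two's-complement weight encoding; the function name `bitserial_dot` and its arguments are illustrative, and it does not model the paper's actual LUT-based LBPE datapath.

```python
def bitserial_dot(weights, acts, n_bits):
    """Dot product with n_bits-precision weights, processed one bit-plane at a time.

    weights: two's-complement weight values encoded as non-negative ints in [0, 2**n_bits).
    acts:    activation values (full precision).
    Each iteration handles one bit position across all weights, mirroring how a
    bit-serial PE reuses one adder tree for every weight precision.
    """
    acc = 0
    for b in range(n_bits):
        # Partial sum for bit-plane b: add the activation wherever weight bit b is set.
        plane = sum(a for w, a in zip(weights, acts) if (w >> b) & 1)
        # In two's complement the MSB carries a negative place value of -2^(n_bits-1).
        place = -(1 << b) if b == n_bits - 1 else (1 << b)
        acc += place * plane
    return acc

# 4-bit example: weights 3 and -2 (-2 encodes as 0b1110 = 14 in two's complement).
# 3*5 + (-2)*7 = 1
print(bitserial_dot([3, 14], [5, 7], 4))  # → 1
```

Lower weight precision simply shortens the loop: at 1 bit the same code finishes in one pass, which is the mechanism behind the 16x throughput gap (345.6 vs. 7372 GOPS) reported in the abstract.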
Pages: 173-185 (13 pages)
Related papers (50 total)
  • [1] UNPU: A 50.6TOPS/W Unified Deep Neural Network Accelerator with 1b-to-16b Fully-Variable Weight Bit-Precision
    Lee, Jinmook
    Kim, Changhyeon
    Kang, Sanghoon
    Shin, Dongjoo
    Kim, Sangyeob
    Yoo, Hoi-Jun
    [J]. 2018 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2018: 218+
  • [2] BitBlade: Energy-Efficient Variable Bit-Precision Hardware Accelerator for Quantized Neural Networks
    Ryu, Sungju
    Kim, Hyungjun
    Yi, Wooseok
    Kim, Eunhwan
    Kim, Yulhwa
    Kim, Taesu
    Kim, Jae-Joon
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2022, 57 (06) : 1924 - 1935
  • [3] An Energy-Efficient Deep Neural Network Accelerator Design
    Jung, Jueun
    Lee, Kyuho Jason
    [J]. 2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2020, : 272 - 276
  • [4] A Precision-Scalable Energy-Efficient Convolutional Neural Network Accelerator
    Liu, Wenjian
    Lin, Jun
    Wang, Zhongfeng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2020, 67 (10) : 3484 - 3497
  • [5] Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks
    Kim, Ji-Hoon
    Lee, Juhyoung
    Lee, Jinsu
    Heo, Jaehoon
    Kim, Joo-Young
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2021, 56 (04) : 1093 - 1104
  • [6] Energy-Efficient Bit-Sparse Accelerator Design for Convolutional Neural Network
    Xiao, Hang
    Xu, Haobo
    Wang, Ying
    Li, Jiajun
    Wang, Yujie
    Han, Yinhe
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (07): : 1122 - 1131
  • [7] DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks
    Duy-Thanh Nguyen
    Bhattacharjee, Abhiroop
    Moitra, Abhishek
    Panda, Priyadarshini
    [J]. 2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2023
  • [8] Ascend: A Scalable and Energy-Efficient Deep Neural Network Accelerator With Photonic Interconnects
    Li, Yuan
    Wang, Ke
    Zheng, Hao
    Louri, Ahmed
    Karanth, Avinash
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (07) : 2730 - 2741
  • [9] BitBlade: Area and Energy-Efficient Precision-Scalable Neural Network Accelerator with Bitwise Summation
    Ryu, Sungju
    Kim, Hyungjun
    Yi, Wooseok
    Kim, Jae-Joon
    [J]. PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019
  • [10] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices
    Choi, Seungkyu
    Sim, Jaehyeong
    Kang, Myeonggu
    Choi, Yeongjae
    Kim, Hyeonuk
    Kim, Lee-Sup
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10) : 2691 - 2702