UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision

Cited by: 212
Authors
Lee, Jinmook [1 ]
Kim, Changhyeon [1 ]
Kang, Sanghoon [1 ]
Shin, Dongjoo [1 ]
Kim, Sangyeob [1 ]
Yoo, Hoi-Jun [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South Korea
Funding
National Research Foundation of Singapore
Keywords
Bit serial; deep learning; deep learning ASIC; deep learning hardware; deep neural network (DNN); mobile deep learning;
DOI
10.1109/JSSC.2018.2865489
CLC classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline codes
0808; 0809
Abstract
An energy-efficient deep neural network (DNN) accelerator, the unified neural processing unit (UNPU), is proposed for mobile deep learning applications. The UNPU supports both convolutional layers (CLs) and recurrent or fully connected layers (FCLs), covering the versatile workload combinations needed to accelerate various mobile deep learning applications. In addition, the UNPU is the first DNN accelerator ASIC to support fully variable weight bit precision from 1 to 16 bit, which lets it operate at the accuracy-energy optimal point. Moreover, the lookup table (LUT)-based bit-serial processing element (LBPE) in the UNPU reduces energy consumption by 23.1%, 27.2%, 41%, and 53.6% for 16-, 8-, 4-, and 1-bit weight precision, respectively, compared to a conventional fixed-point multiply-and-accumulate (MAC) array. Beyond the energy-efficiency improvement, the unified DNN core architecture of the UNPU improves peak performance for CLs by 1.15x over the previous work, allowing the UNPU to run at a lower voltage and frequency for a given DNN and further increase energy efficiency. The UNPU is implemented in 65-nm CMOS technology and occupies a 4 x 4 mm² die. It operates from a 0.63- to 1.1-V supply voltage with a maximum frequency of 200 MHz, achieving a peak performance of 345.6 GOPS at 16-bit weight precision and 7372 GOPS at 1-bit weight precision. The wide operating range of the UNPU yields a power efficiency of 3.08 TOPS/W at 16-bit weight precision and 50.6 TOPS/W at 1-bit weight precision. The functionality of the UNPU is successfully demonstrated on a verification system running an ImageNet deep CNN (VGG-16).
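The abstract's key idea is bit-serial weight processing: a dot product with n-bit weights is computed one weight bit-plane per step, so the same hardware serves any precision from 1 to 16 bit. The sketch below is a minimal software illustration of that principle only, assuming two's-complement weight encoding; the function name `bitserial_dot` and its arguments are illustrative, and it does not model the paper's actual LUT-based LBPE datapath.

```python
def bitserial_dot(weights, acts, n_bits):
    """Dot product with n_bits-precision weights, processed one bit-plane at a time.

    weights: two's-complement weight values encoded as non-negative ints in [0, 2**n_bits).
    acts:    activation values (full precision).
    Each iteration handles one bit position across all weights, mirroring how a
    bit-serial PE reuses one adder tree for every weight precision.
    """
    acc = 0
    for b in range(n_bits):
        # Partial sum for bit-plane b: add the activation wherever weight bit b is set.
        plane = sum(a for w, a in zip(weights, acts) if (w >> b) & 1)
        # In two's complement the MSB carries a negative place value of -2^(n_bits-1).
        place = -(1 << b) if b == n_bits - 1 else (1 << b)
        acc += place * plane
    return acc

# 4-bit example: weights 3 and -2 (-2 encodes as 0b1110 = 14 in two's complement).
# 3*5 + (-2)*7 = 1
print(bitserial_dot([3, 14], [5, 7], 4))  # → 1
```

Lower weight precision simply shortens the loop: at 1 bit the same code finishes in one pass, which is the mechanism behind the 16x throughput gap (345.6 vs. 7372 GOPS) reported in the abstract.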
Pages: 173-185 (13 pages)
Related papers (50 total)
  • [1] UNPU: A 50.6TOPS/W Unified Deep Neural Network Accelerator with 1b-to-16b Fully-Variable Weight Bit-Precision
    Lee, Jinmook
    Kim, Changhyeon
    Kang, Sanghoon
    Shin, Dongjoo
    Kim, Sangyeob
    Yoo, Hoi-Jun
    [J]. 2018 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2018: 218+
  • [2] BitBlade: Energy-Efficient Variable Bit-Precision Hardware Accelerator for Quantized Neural Networks
    Ryu, Sungju
    Kim, Hyungjun
    Yi, Wooseok
    Kim, Eunhwan
    Kim, Yulhwa
    Kim, Taesu
    Kim, Jae-Joon
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2022, 57 (06) : 1924 - 1935
  • [3] An Energy-Efficient Deep Neural Network Accelerator Design
    Jung, Jueun
    Lee, Kyuho Jason
    [J]. 2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2020, : 272 - 276
  • [4] A Precision-Scalable Energy-Efficient Convolutional Neural Network Accelerator
    Liu, Wenjian
    Lin, Jun
    Wang, Zhongfeng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2020, 67 (10) : 3484 - 3497
  • [5] Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks
    Kim, Ji-Hoon
    Lee, Juhyoung
    Lee, Jinsu
    Heo, Jaehoon
    Kim, Joo-Young
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2021, 56 (04) : 1093 - 1104
  • [6] Energy-Efficient Bit-Sparse Accelerator Design for Convolutional Neural Network
    Xiao, Hang
    Xu, Haobo
    Wang, Ying
    Li, Jiajun
    Wang, Yujie
    Han, Yinhe
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (07): : 1122 - 1131
  • [7] DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks
    Duy-Thanh Nguyen
    Bhattacharjee, Abhiroop
    Moitra, Abhishek
    Panda, Priyadarshini
    [J]. 2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2023
  • [8] Ascend: A Scalable and Energy-Efficient Deep Neural Network Accelerator With Photonic Interconnects
    Li, Yuan
    Wang, Ke
    Zheng, Hao
    Louri, Ahmed
    Karanth, Avinash
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (07) : 2730 - 2741
  • [9] BitBlade: Area and Energy-Efficient Precision-Scalable Neural Network Accelerator with Bitwise Summation
    Ryu, Sungju
    Kim, Hyungjun
    Yi, Wooseok
    Kim, Jae-Joon
    [J]. PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019
  • [10] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices
    Choi, Seungkyu
    Sim, Jaehyeong
    Kang, Myeonggu
    Choi, Yeongjae
    Kim, Hyeonuk
    Kim, Lee-Sup
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10) : 2691 - 2702