An Energy-Efficient Sparse Deep-Neural-Network Learning Accelerator With Fine-Grained Mixed Precision of FP8-FP16

Cited by: 14
Authors
Lee, Jinsu [1 ]
Lee, Juhyoung [1 ]
Han, Donghyeon [1 ]
Lee, Jinmook [1 ]
Park, Gwangtae [1 ]
Yoo, Hoi-Jun [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Elect Engn, Daejeon 34141, South Korea
Source
IEEE SOLID-STATE CIRCUITS LETTERS | 2019, Vol. 2, No. 11
Funding
National Research Foundation of Singapore
Keywords
Accelerators; deep learning (DL); deep-neural network (DNN); energy efficient; training
DOI
10.1109/LSSC.2019.2937440
Chinese Library Classification: TP3 [Computing Technology, Computer Technology]
Discipline code: 0812
Abstract
Recently, several hardware accelerators have been reported for deep-neural-network (DNN) acceleration; however, they focus only on inference rather than DNN learning, which is a crucial ingredient for user adaptation at the edge device as well as for transfer learning with domain-specific data. DNN learning requires much heavier floating-point (FP) computation and memory access than DNN inference, so dedicated DNN learning hardware is essential. In this letter, we present an energy-efficient DNN learning accelerator core supporting CNN and FC learning as well as inference, with the following three key features: 1) fine-grained mixed precision (FGMP); 2) compressed sparse DNN learning/inference; and 3) an input load balancer. As a result, energy efficiency is improved by 1.76x compared with sparse FP16 operation, without any degradation of learning accuracy. The energy efficiency is 4.9x higher than that of the NVIDIA V100 GPU, and the normalized peak performance is 3.47x higher than that of a previous DNN learning processor.
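The FGMP idea in the abstract can be illustrated with a short sketch: each tensor element is kept in FP8 only when FP8 rounding loses little relative precision, and outliers fall back to FP16. This is a minimal emulation assuming a 1-4-3 (sign-exponent-mantissa) FP8 layout; the exact format, tolerance, and the helper names `round_to_fp8` and `fgmp_split` are illustrative, not taken from the letter.

```python
import numpy as np

def round_to_fp8(x, man_bits=3):
    # Emulate FP8 mantissa rounding (assumed 1-4-3 layout; exponent
    # range clamping is omitted for brevity). np.frexp gives
    # x = m * 2**e with 0.5 <= |m| < 1, so rounding m to
    # man_bits + 1 fractional bits reproduces a 3-bit mantissa
    # with an implicit leading 1.
    m, e = np.frexp(x)
    scale = 2.0 ** (man_bits + 1)
    return np.ldexp(np.round(m * scale) / scale, e)

def fgmp_split(x, rel_tol=2.0 ** -4):
    # Fine-grained mixed precision, decided per element: keep FP8
    # where the rounding error is small relative to the value, fall
    # back to FP16 otherwise. Returns the mixed-precision tensor and
    # the mask of elements that needed FP16.
    x = np.asarray(x, dtype=np.float32)
    x8 = round_to_fp8(x)
    keep8 = np.abs(x8 - x) <= rel_tol * np.abs(x)
    out = np.where(keep8, x8, x.astype(np.float16).astype(np.float32))
    return out, ~keep8
```

In hardware, the per-element mask would steer the datapath between FP8 and FP16 units; here it only records which elements required the wider format.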
Pages: 232-235 (4 pages)