Quantized Deep Neural Networks for Energy Efficient Hardware-based Inference

Cited by: 0
Authors
Ding, Ruizhou [1 ]
Liu, Zeye [1 ]
Blanton, R. D. [1 ]
Marculescu, Diana [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
DOI
None available
CLC number
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
Deep Neural Networks (DNNs) have been adopted in many systems because of their high classification accuracy, with custom hardware implementations being strong candidates for high-speed, accurate inference. While progress has been made toward large-scale, highly accurate DNNs, they require significant energy and area due to massive numbers of memory accesses and computations. These demands pose a challenge to any DNN implementation, but are more naturally handled on a custom hardware platform. To alleviate the increased demand for storage and energy, quantized DNNs constrain their weights (and activations) from floating-point numbers to only a few discrete levels. Storage is therefore reduced, leading to fewer memory accesses. In this paper, we provide an overview of different types of quantized DNNs, as well as the approaches used to train them. Among the various quantized DNNs, our LightNN (Light Neural Network) approach reduces both memory accesses and computation energy by filling the gap between classic full-precision DNNs and binarized DNNs. We provide a detailed comparison among LightNNs, conventional DNNs, and Binarized Neural Networks (BNNs) on the MNIST and CIFAR-10 datasets. In contrast to other quantized DNNs that trade off significant amounts of accuracy for lower memory requirements, LightNNs can significantly reduce storage, energy, and area while maintaining a test error similar to that of a large DNN configuration. Thus, LightNNs give hardware designers more options for trading off accuracy against energy.
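The quantization the abstract describes (constraining floating-point weights to a few discrete levels) can be sketched as follows. This is a minimal illustrative example, assuming LightNN-style weights are approximated by a sum of k signed powers of two so that multiplications can be replaced by shifts and adds; the function name quantize_pow2 and the greedy residual scheme are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def quantize_pow2(w, k=1):
    """Approximate each weight by a sum of k signed powers of two.

    With k=1 this resembles a shift-based quantization; larger k
    narrows the gap to full-precision weights, illustrating the
    accuracy/storage trade-off discussed in the abstract.
    """
    w = np.asarray(w, dtype=np.float64)
    q = np.zeros_like(w)          # running quantized approximation
    for _ in range(k):
        r = w - q                 # residual still to be represented
        sign = np.sign(r)
        mag = np.abs(r)
        # nearest power-of-two exponent (in the log domain); zeros stay zero
        exp = np.where(mag > 0, np.round(np.log2(np.maximum(mag, 1e-38))), 0)
        q += np.where(mag > 0, sign * 2.0 ** exp, 0.0)
    return q
```

For example, a weight of 0.3 becomes 0.25 (= 2^-2) with k=1 and 0.3125 (= 2^-2 + 2^-4) with k=2, so each additional power-of-two term spends a little more storage to recover accuracy.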
Pages: 1 - 8
Page count: 8
Related Papers
50 records in total
  • [1] Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks
    Hacene, Ghouthi Boukli
    Gripon, Vincent
    Arzel, Matthieu
    Farrugia, Nicolas
    Bengio, Yoshua
    [J]. 2020 18TH IEEE INTERNATIONAL NEW CIRCUITS AND SYSTEMS CONFERENCE (NEWCAS'20), 2020, : 206 - 209
  • [2] A Hardware Accelerator Based on Quantized Weights for Deep Neural Networks
    Sreehari, R.
    Deepu, Vijayasenan
    Arulalan, M. R.
    [J]. EMERGING RESEARCH IN ELECTRONICS, COMPUTER SCIENCE AND TECHNOLOGY, ICERECT 2018, 2019, 545 : 1079 - 1091
  • [3] Efficient Hardware Acceleration for Approximate Inference of Bitwise Deep Neural Networks
    Vogel, Sebastian
    Guntoro, Andre
    Ascheid, Gerd
    [J]. 2017 CONFERENCE ON DESIGN AND ARCHITECTURES FOR SIGNAL AND IMAGE PROCESSING (DASIP), 2017,
  • [4] EBSP: Evolving Bit Sparsity Patterns for Hardware-Friendly Inference of Quantized Deep Neural Networks
    Liu, Fangxin
    Zhao, Wenbo
    Wang, Zongwu
    Chen, Yongbiao
    He, Zhezhi
    Jing, Naifeng
    Liang, Xiaoyao
    Jiang, Li
    [J]. PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 259 - 264
  • [5] Hardware for Quantized Mixed-Precision Deep Neural Networks
    Rios, Andres
    Nava, Patricia
    [J]. PROCEEDINGS OF THE 2022 15TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS 2022), 2022,
  • [6] Inference and Energy Efficient Design of Deep Neural Networks for Embedded Devices
    Galanis, Ioannis
    Anagnostopoulos, Iraklis
    Nguyen, Chinh
    Bares, Guillermo
    Burkard, Dona
    [J]. 2020 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2020), 2020, : 36 - 41
  • [7] Adaptive learning rule for hardware-based deep neural networks using electronic synapse devices
    Suhwan Lim
    Jong-Ho Bae
    Jai-Ho Eum
    Sungtae Lee
    Chul-Heung Kim
    Dongseok Kwon
    Byung-Gook Park
    Jong-Ho Lee
    [J]. Neural Computing and Applications, 2019, 31 : 8101 - 8116
  • [8] Adaptive learning rule for hardware-based deep neural networks using electronic synapse devices
    Lim, Suhwan
    Bae, Jong-Ho
    Eum, Jai-Ho
    Lee, Sungtae
    Kim, Chul-Heung
    Kwon, Dongseok
    Park, Byung-Gook
    Lee, Jong-Ho
    [J]. NEURAL COMPUTING & APPLICATIONS, 2019, 31 (11): : 8101 - 8116
  • [9] A Pipelined Energy-efficient Hardware Acceleration for Deep Convolutional Neural Networks
    Alaeddine, Hmidi
    Jihene, Malek
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON DESIGN & TEST OF INTEGRATED MICRO & NANO-SYSTEMS (DTS), 2019,
  • [10] FLightNNs: Lightweight Quantized Deep Neural Networks for Fast and Accurate Inference
    Ding, Ruizhou
    Liu, Zeye
    Chin, Ting-Wu
    Marculescu, Diana
    Blanton, R. D.
    [J]. PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019,