An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level

Cited by: 1
Authors
Maleki, Mohammad-Ali [1]
Nabipour-Meybodi, Alireza [1]
Kamal, Mehdi [1]
Afzali-Kusha, Ali [1, 2]
Pedram, Massoud [3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran 1996715433, Iran
[2] Inst Res Fundamental Sci, Tehran 1953833511, Iran
[3] Univ Southern Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
Keywords
Pruned neural network; online management; energy efficiency; DNN; image classification
DOI
10.1145/3460972
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we present a low-energy inference method for convolutional neural networks in image classification applications. The lower energy consumption is achieved by using a highly pruned (lower-energy) network whenever that network can provide a correct output. More specifically, the proposed inference method makes use of two pruned neural networks (NNs), namely a mildly pruned and an aggressively pruned network, both of which are designed offline. In the system, a third NN uses the input data to select the appropriate pruned network online. For its feature extraction, the third network employs the same convolutional layers as the aggressively pruned NN, thereby reducing the overhead of the online management. The proposed method induces some accuracy loss; however, for a given level of accuracy, its energy gain is considerably larger than that of employing any single pruning level. The proposed method is independent of both the pruning method and the network architecture. The efficacy of the proposed inference method is assessed on the Eyeriss hardware accelerator platform for some state-of-the-art NN architectures. Our studies show that this method may provide, on average, a 70% energy reduction compared to the original NN at the cost of about 3% accuracy loss on the CIFAR-10 dataset.
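The selection flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `DynamicPruningInference`, the idea that the selector outputs a score compared against a `threshold`, and the toy stand-in functions are all assumptions made for illustration. The only points taken from the abstract are that a selector chooses between the two pruned networks per input and that it reuses the convolutional features of the aggressively pruned network.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class DynamicPruningInference:
    """Hypothetical wrapper routing each input to one of two pruned networks."""
    # Convolutional layers of the aggressively pruned NN (shared with the selector).
    shared_features: Callable[[Sequence[float]], List[float]]
    # Classifier head of the aggressively pruned NN.
    aggressive_head: Callable[[List[float]], int]
    # Selector head: assumed to return a score in [0, 1] estimating whether
    # the aggressively pruned NN will classify this input correctly.
    selector_head: Callable[[List[float]], float]
    # Full mildly pruned NN, run only when the selector rejects the cheap path.
    mild_network: Callable[[Sequence[float]], int]
    # Assumed decision threshold; the abstract does not specify one.
    threshold: float = 0.5

    def predict(self, image: Sequence[float]) -> Tuple[int, str]:
        # Features are computed once and reused by selector and cheap head.
        feats = self.shared_features(image)
        if self.selector_head(feats) >= self.threshold:
            return self.aggressive_head(feats), "aggressive"
        return self.mild_network(image), "mild"


# Toy stand-ins (assumptions): an "image" is a flat list of pixel intensities.
def toy_features(image: Sequence[float]) -> List[float]:
    return [sum(image) / len(image)]


def toy_label(mean: float) -> int:
    return 1 if mean > 0.5 else 0


model = DynamicPruningInference(
    shared_features=toy_features,
    aggressive_head=lambda f: toy_label(f[0]),
    # Inputs whose mean lies far from the decision boundary count as "easy".
    selector_head=lambda f: min(1.0, abs(f[0] - 0.5) * 2),
    mild_network=lambda img: toy_label(sum(img) / len(img)),
)

easy_label, easy_path = model.predict([0.9, 0.95, 1.0, 0.85])
hard_label, hard_path = model.predict([0.4, 0.6, 0.55, 0.5])
```

In this sketch, inputs judged easy take the low-energy path and the rest fall back to the mildly pruned network, so the average energy depends on how often the selector accepts the cheap path.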
Pages: 20