Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks

Cited by: 2
Authors
Doerrich, Marion [1]
Fan, Mingcheng [1]
Kist, Andreas M. [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg, Dept Artificial Intelligence Biomed Engn, D-91052 Erlangen, Germany
Keywords
Deep learning; green AI; energy efficiency; mixed precision training; quantization; edge TPU
DOI
10.1109/ACCESS.2023.3284388
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In the deep learning community, increasingly large models are being developed, leading to rapidly growing computational and energy costs. Recently, a trend has emerged advocating that researchers report energy efficiency alongside their model's performance in their papers. Previous research has shown that reduced precision can help improve energy efficiency. Based on this finding, we propose a simple practice to effectively improve the energy efficiency of training and inference, i.e., training the model with mixed precision and deploying it on Edge TPUs. We evaluated its effectiveness by comparing the speedup of a state-of-the-art semantic segmentation architecture across typical usage scenarios, including different devices, deep learning frameworks, model sizes, and batch sizes. Our results show that enabling mixed precision can yield up to a $1.9\times$ speedup compared to the default and most common float32 data type on GPUs. Deploying the models on Edge TPUs further accelerated inference by a factor of 6. Our approach allows researchers to accelerate their training and inference procedures without jeopardizing the model's accuracy, while easily reducing energy consumption and electricity cost without changing their model architecture or retraining. Furthermore, our approach helps reduce the carbon footprint of training and deploying the neural network and thus has a positive effect on environmental resources.
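As a rough illustration of the mixed precision idea named in the abstract, the sketch below uses only the Python standard library to emulate float16 and float32 rounding. It is not the authors' implementation and uses no deep learning framework. The helper names (`to_fp16`, `to_fp32`, `accumulate`) and the float32 "master weight" scheme are assumptions, reflecting common mixed precision practice rather than anything stated in this record. It shows why a weight kept purely in float16 can silently ignore small updates, while a float32 copy accumulates them.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(steps: int, update: float):
    """Apply the same small update `steps` times in two ways.

    Returns (weight kept purely in float16, float32 master weight).
    Both weights start at 1.0 (a hypothetical initial value).
    """
    w16 = to_fp16(1.0)     # weight stored and updated in float16 only
    master = to_fp32(1.0)  # float32 master copy (assumed mixed precision scheme)
    u16 = to_fp16(update)  # the update itself is produced in float16
    for _ in range(steps):
        # Near 1.0 the float16 spacing is 2**-10, so an increment smaller
        # than half of that is rounded away.
        w16 = to_fp16(w16 + u16)
        # float32 has enough mantissa bits to keep the small increment.
        master = to_fp32(master + u16)
    return w16, master


if __name__ == "__main__":
    w16, master = accumulate(steps=1000, update=1e-4)
    print("float16-only weight:", w16)
    print("float32 master weight:", master)
    print("float16 copy used for compute:", to_fp16(master))
```

In this toy setting the float16-only weight never moves, whereas the float32 master weight drifts toward 1.1 and can be cast back to float16 for the fast arithmetic. Real frameworks automate this bookkeeping.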
Pages: 57627-57634
Page count: 8