Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks

Cited by: 2

Authors:

Doerrich, Marion [1]

Fan, Mingcheng [1]

Kist, Andreas M. [1]

Affiliations:

[1] Friedrich Alexander Univ Erlangen Nurnberg, Dept Artificial Intelligence Biomed Engn, D-91052 Erlangen, Germany

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Deep learning; green AI; energy efficiency; mixed precision training; quantization; edge TPU;

DOI:

10.1109/ACCESS.2023.3284388

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In the deep learning community, increasingly large models are being developed, leading to rapidly growing computational and energy costs. A recent trend advocates that researchers report energy efficiency alongside model performance in their papers. Previous research has shown that reduced precision can improve energy efficiency. Based on this finding, we propose a simple practice to improve the energy efficiency of training and inference: training the model with mixed precision and deploying it on Edge TPUs. We evaluated its effectiveness by comparing the speedup of a state-of-the-art semantic segmentation architecture across typical usage scenarios, including different devices, deep learning frameworks, model sizes, and batch sizes. Our results show that enabling mixed precision yields up to a $1.9\times$ speedup over the default and most common float32 data type on GPUs. Deploying the models on an Edge TPU further accelerated inference by a factor of 6. Our approach allows researchers to accelerate training and inference without jeopardizing the model's accuracy, while reducing energy consumption and electricity cost without changing the model architecture or retraining. Furthermore, our approach helps reduce the carbon footprint of training and deploying neural networks and thus has a positive effect on environmental resources.

Pages: 57627-57634

Page count: 8
