EVOLUTIONARY QUANTIZATION OF NEURAL NETWORKS WITH MIXED-PRECISION

Cited by: 10

Authors:

Liu, Zhenhua [1]

Zhang, Xinfeng [2]

Wang, Shanshe [1,3]

Ma, Siwei [1,3]

Gao, Wen [1,3]

Affiliations:

[1] Peking Univ, Sch Elect Engn & Comp Sci, Inst Digital Media, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing, Peoples R China

[3] Peking Univ, Informat Technol R&D Innovat Ctr, Shaoxing 31200, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Funding:

National Natural Science Foundation of China;

Keywords:

Quantization; Mixed-precision; Deep Neural Networks;

DOI:

10.1109/ICASSP39728.2021.9413631

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Quantization is an effective way to reduce the memory and computation costs of deep neural networks. Most existing methods use fixed-precision quantization, e.g., weights and activations (i.e., output features) are represented as 8-bit values. Although mixed-precision quantization offers greater flexibility to allocate computation resources efficiently while maintaining network performance, it is difficult to determine the optimal bit-width of each layer accurately. In this paper, we develop a novel evolutionary method, Evolutionary Mixed-Precision Quantization (EMQ), to automatically determine the bit-widths of weights and activations in each convolutional layer. Specifically, the quantization intervals of the weights and activations of all layers in the given network are simultaneously encoded as an individual. The fitness of each individual is calculated as the performance of the corresponding quantized network. The best quantization result is updated and selected during the evolutionary search. Extensive experiments on benchmark datasets and models demonstrate the effectiveness of the proposed method over state-of-the-art network quantization algorithms.
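
The search loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' EMQ implementation, and every name in it (`uniform_quantize`, `fitness`, `evolve`, `budget_bits`, `choices`) is hypothetical. It makes several simplifying assumptions: an individual encodes one weight bit-width per layer rather than the quantization intervals of both weights and activations, layers are plain lists of floats, and fitness is a toy proxy (negative squared quantization error with a penalty for exceeding a bit budget) rather than the accuracy of the quantized network.

```python
import random


def uniform_quantize(values, bits):
    """Snap each value to one of 2**bits evenly spaced levels over the list's own range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]


def fitness(individual, layers, budget_bits):
    """Toy proxy (assumption): negative quantization error minus an over-budget penalty.

    The abstract instead uses the performance of the quantized network.
    """
    error = 0.0
    for bits, weights in zip(individual, layers):
        quantized = uniform_quantize(weights, bits)
        error += sum((w - q) ** 2 for w, q in zip(weights, quantized))
    cost = sum(bits * len(weights) for bits, weights in zip(individual, layers))
    return -error - max(0, cost - budget_bits)


def evolve(layers, budget_bits, choices=(2, 4, 8), pop_size=20,
           generations=30, mutation_rate=0.3, seed=0):
    """Evolve per-layer bit-widths with truncation selection, one-point crossover, and mutation."""
    rng = random.Random(seed)

    def score(individual):
        return fitness(individual, layers, budget_bits)

    # Each individual is a list holding one bit-width per layer.
    population = [[rng.choice(choices) for _ in layers] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: pop_size // 2]  # keep the better half unchanged (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(layers)) if len(layers) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < mutation_rate:
                child[rng.randrange(len(child))] = rng.choice(choices)
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Calling `evolve(layers, budget_bits)` on a list of per-layer weight lists returns one bit-width per layer. Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.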


Pages: 2785-2789

Number of pages: 5
