EVOLUTIONARY QUANTIZATION OF NEURAL NETWORKS WITH MIXED-PRECISION

Cited by: 10
Authors
Liu, Zhenhua [1]
Zhang, Xinfeng [2]
Wang, Shanshe [1,3]
Ma, Siwei [1,3]
Gao, Wen [1,3]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Inst Digital Media, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing, Peoples R China
[3] Peking Univ, Informat Technol R&D Innovat Ctr, Shaoxing 31200, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Quantization; Mixed-precision; Deep Neural Networks
DOI
10.1109/ICASSP39728.2021.9413631
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Quantization is an effective way to reduce the memory and computation costs of deep neural networks. Most existing methods use fixed-precision quantization, e.g., weights and activations (i.e., output features) are represented as 8-bit values. Although mixed-precision quantization offers greater flexibility to allocate computation resources efficiently while maintaining network performance, it is difficult to determine the optimal bit-width of each layer accurately. In this paper, we develop a novel evolutionary method, Evolutionary Mixed-Precision Quantization (EMQ), to automatically determine the bit-widths of weights and activations in each convolutional layer. Specifically, the quantization intervals of the weights and activations of all layers in the given network are simultaneously encoded as an individual. The fitness of each individual is calculated as the performance of the corresponding quantized network. The best quantization result is updated and selected during the evolutionary search. Extensive experiments on benchmark datasets and models demonstrate the effectiveness of the proposed method over state-of-the-art network quantization algorithms.
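The search loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' EMQ implementation: the sketch encodes per-layer bit-widths directly (the abstract says quantization intervals are encoded), and `toy_fitness`, `BIT_CHOICES`, `bit_budget`, and the truncation-selection scheme are all assumptions made for illustration. In particular, the fitness here is a quantization-error proxy on synthetic values, whereas the paper evaluates the performance of the quantized network.

```python
import random

BIT_CHOICES = (2, 3, 4, 5, 6, 7, 8)  # assumed candidate bit-widths


def quantize(values, bits):
    """Uniformly quantize values onto 2**bits levels spanning their range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((v - lo) / step) * step for v in values]


def toy_fitness(individual, layers, bit_budget, penalty_weight=5.0):
    """Stand-in fitness (assumption): negative squared quantization error,
    minus a penalty when the mean bit-width exceeds bit_budget."""
    error = 0.0
    for (w_bits, a_bits), (weights, acts) in zip(individual, layers):
        error += sum((q - v) ** 2 for q, v in zip(quantize(weights, w_bits), weights))
        error += sum((q - v) ** 2 for q, v in zip(quantize(acts, a_bits), acts))
    mean_bits = sum(w + a for w, a in individual) / (2 * len(individual))
    return -error - penalty_weight * max(0.0, mean_bits - bit_budget)


def evolve(layers, bit_budget=4.0, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Search per-layer (weight_bits, activation_bits) pairs.

    Each individual is a list with one (w_bits, a_bits) gene per layer.
    """
    rng = random.Random(seed)

    def score(ind):
        return toy_fitness(ind, layers, bit_budget)

    def random_gene():
        return (rng.choice(BIT_CHOICES), rng.choice(BIT_CHOICES))

    def crossover(a, b):
        # Pick each layer's gene from one of the two parents.
        return [rng.choice(pair) for pair in zip(a, b)]

    def mutate(ind):
        # Resample a layer's gene with probability mutation_rate.
        return [random_gene() if rng.random() < mutation_rate else g for g in ind]

    population = [[random_gene() for _ in layers] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: pop_size // 2]  # the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Synthetic stand-ins for per-layer weights and activations.
    toy_layers = [
        ([data_rng.gauss(0, 1) for _ in range(32)],
         [abs(data_rng.gauss(0, 1)) for _ in range(32)])
        for _ in range(4)
    ]
    print(evolve(toy_layers))
```

Because the fitter half of each generation is carried over unchanged, the best score never decreases between generations. Swapping `toy_fitness` for a validation-accuracy measurement of the quantized network would bring the sketch closer to the fitness definition given in the abstract.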
Pages: 2785-2789
Page count: 5
Related papers
  • [1] HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
    Dong, Zhen
    Yao, Zhewei
    Gholami, Amir
    Mahoney, Michael W.
    Keutzer, Kurt
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 293 - 302
  • [2] Mixed-precision quantization-aware training for photonic neural networks
    Manos Kirtas
    Nikolaos Passalis
    Athina Oikonomou
    Miltos Moralis-Pegios
    George Giamougiannis
    Apostolos Tsakyridis
    George Mourgias-Alexandris
    Nikolaos Pleros
    Anastasios Tefas
    [J]. Neural Computing and Applications, 2023, 35 : 21361 - 21379
  • [3] Mixed-precision quantization-aware training for photonic neural networks
    Kirtas, Manos
    Passalis, Nikolaos
    Oikonomou, Athina
    Moralis-Pegios, Miltos
    Giamougiannis, George
    Tsakyridis, Apostolos
    Mourgias-Alexandris, George
    Pleros, Nikolaos
    Tefas, Anastasios
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (29): : 21361 - 21379
  • [4] Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
    Chen, Weihan
    Wang, Peisong
    Cheng, Jian
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 5330 - 5339
  • [5] Mixed-precision quantization for neural networks based on error limit (Invited)
    Li, Yiduo
    Guo, Zibo
    Liu, Kai
    Sun, Xiaoyao
    [J]. Hongwai yu Jiguang Gongcheng/Infrared and Laser Engineering, 2022, 51 (04):
  • [6] Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks
    Vasquez, Karina
    Venkatesha, Yeshwanth
    Bhattacharjee, Abhiroop
    Moitra, Abhishek
    Panda, Priyadarshini
    [J]. PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 1360 - 1365
  • [7] Joint Optimization of Dimension Reduction and Mixed-Precision Quantization for Activation Compression of Neural Networks
    Tai, Yu-Shan
    Chang, Cheng-Yang
    Teng, Chieh-Fang
    Chen, Yi-Ta
    Wu, An-Yeu
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (11) : 4025 - 4037
  • [8] Hardware for Quantized Mixed-Precision Deep Neural Networks
    Rios, Andres
    Nava, Patricia
    [J]. PROCEEDINGS OF THE 2022 15TH IEEE DALLAS CIRCUITS AND SYSTEMS CONFERENCE (DCAS 2022), 2022,
  • [9] Mixed-precision Deep Neural Network Quantization With Multiple Compression Rates
    Wang, Xuanda
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    [J]. 2023 DATA COMPRESSION CONFERENCE, DCC, 2023, : 371 - 371
  • [10] Rethinking Differentiable Search for Mixed-Precision Neural Networks
    Cai, Zhaowei
    Vasconcelos, Nuno
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2346 - 2355