Exploration of Automatic Mixed-Precision Search for Deep Neural Networks

Citations: 0
Authors
Guo, Xuyang [1]
Huang, Yuanjun [2]
Cheng, Hsin-pai [3]
Li, Bing [3, 5]
Wen, Wei [3]
Ma, Siyuan [4]
Li, Hai [3]
Chen, Yiran [3]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[3] Duke Univ, Durham, NC 27706 USA
[4] Xi An Jiao Tong Univ, Xian, Shaanxi, Peoples R China
[5] Army Res Off, Res Triangle Pk, NC 27709 USA
DOI
10.1109/aicas.2019.8771498
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural networks have shown great performance in cognitive tasks. When network models are deployed on mobile devices with limited computation and storage resources, weight quantization is widely adopted. In practice, 8-bit or 16-bit quantization is most likely to be selected in order to keep accuracy at the same level as models in 32-bit floating-point precision. Binary quantization, by contrast, aims for the highest compression at the cost of a much larger accuracy drop. Applying different precisions to different layers/structures can potentially produce the most efficient model. Searching for the best precision configuration, however, is difficult. In this work, we propose an automatic search algorithm to address this challenge. By relaxing the search space of quantization bitwidths from a discrete to a continuous domain, our algorithm can generate a mixed-precision quantization scheme that achieves a compression rate close to that of the binary-weighted model while maintaining testing accuracy similar to that of the original full-precision model.
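As a rough illustration of what relaxing the bitwidth from a discrete to a continuous domain could look like, the following Python sketch quantizes a weight tensor at a fractional bitwidth by linearly interpolating between the two neighbouring integer bitwidths, and computes the compression rate of a per-layer bitwidth assignment relative to 32-bit weights. This is not the authors' algorithm, which the abstract does not specify; the interpolation scheme, the symmetric uniform quantizer, the binary scaling rule, and all function names (quantize_uniform, quantize_relaxed, compression_rate) are assumptions made for illustration only.

```python
import math
import numpy as np


def quantize_uniform(w, bits):
    """Quantize w to an integer bitwidth (assumed scheme, for illustration)."""
    if bits == 1:
        # Assumed binary rule: sign of each weight times the mean magnitude.
        return np.sign(w) * np.mean(np.abs(w))
    levels = 2 ** (bits - 1) - 1          # positive levels of a symmetric grid
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / levels
    return np.round(w / scale) * scale


def quantize_relaxed(w, b):
    """Quantize at a continuous bitwidth b >= 1 by interpolating between
    the quantizations at floor(b) and ceil(b). Hypothetical relaxation."""
    lo, hi = math.floor(b), math.ceil(b)
    if lo == hi:
        return quantize_uniform(w, lo)
    frac = b - lo
    return (1 - frac) * quantize_uniform(w, lo) + frac * quantize_uniform(w, hi)


def compression_rate(bitwidths, sizes, full_bits=32):
    """Ratio of full-precision storage to mixed-precision storage,
    given per-layer bitwidths and per-layer weight counts."""
    quantized = sum(b * n for b, n in zip(bitwidths, sizes))
    return full_bits * sum(sizes) / quantized


if __name__ == "__main__":
    w = np.array([-1.0, -0.5, 0.25, 1.0])
    print(quantize_relaxed(w, 3.5))
    print(compression_rate([8, 2, 1], [100, 1000, 100]))
```

Because the relaxed quantizer varies smoothly with b between integer bitwidths, a search procedure could in principle adjust b with gradient-based or other continuous optimization and round to an integer bitwidth at the end; whether the paper does this is not stated in the abstract.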
Pages: 276-278
Page count: 3