Exploration of Automatic Mixed-Precision Search for Deep Neural Networks

Authors
Guo, Xuyang [1]
Huang, Yuanjun [2]
Cheng, Hsin-pai [3]
Li, Bing [3, 5]
Wen, Wei [3]
Ma, Siyuan [4]
Li, Hai [3]
Chen, Yiran [3]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[3] Duke Univ, Durham, NC 27706 USA
[4] Xi An Jiao Tong Univ, Xian, Shaanxi, Peoples R China
[5] Army Res Off, Res Triangle Pk, NC 27709 USA
DOI
10.1109/aicas.2019.8771498
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural networks have shown great performance in cognitive tasks. When network models are deployed on mobile devices with limited computation and storage resources, weight quantization is widely adopted. In practice, 8-bit or 16-bit quantization is most likely to be selected in order to keep accuracy at the same level as that of models in 32-bit floating-point precision. Binary quantization, by contrast, aims for the highest compression at the cost of a much larger accuracy drop. Applying different precisions to different layers or structures can potentially produce the most efficient model. Searching for the best precision configuration, however, is difficult. In this work, we propose an automatic search algorithm to address this challenge. By relaxing the search space of the quantization bitwidth from a discrete to a continuous domain, our algorithm can generate a mixed-precision quantization scheme that achieves a compression rate close to that of the binary-weighted model while maintaining testing accuracy similar to that of the original full-precision model.
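The abstract does not say how the bitwidth is relaxed, so the following is only a minimal sketch of one common way to do it, not the authors' algorithm. It assumes symmetric uniform quantization and treats a fractional bitwidth as a linear blend of the two neighbouring integer bitwidths. The function names `quantize_uniform`, `quantize_relaxed`, and `compression_rate` are hypothetical and introduced here for illustration.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization to an integer bitwidth (assumed scheme)."""
    w = np.asarray(w, dtype=np.float64)
    scale = np.max(np.abs(w))
    if scale == 0:
        return w.copy()
    if bits == 1:
        # Binary weights: sign of each weight times the mean magnitude.
        return np.sign(w) * np.mean(np.abs(w))
    levels = 2 ** (bits - 1) - 1  # number of positive quantization levels
    return np.round(w / scale * levels) / levels * scale

def quantize_relaxed(w, bits):
    """Continuous bitwidth: blend the two neighbouring integer bitwidths.

    This interpolation is an illustrative assumption, not taken from the paper.
    """
    lo, hi = int(np.floor(bits)), int(np.ceil(bits))
    if lo == hi:
        return quantize_uniform(w, lo)
    frac = bits - lo
    return (1 - frac) * quantize_uniform(w, lo) + frac * quantize_uniform(w, hi)

def compression_rate(bitwidths, sizes, full_bits=32):
    """Ratio of full-precision storage to mixed-precision storage."""
    total_full = full_bits * sum(sizes)
    total_mixed = sum(b * s for b, s in zip(bitwidths, sizes))
    return total_full / total_mixed

w = np.array([-1.0, -0.5, 0.0, 0.25, 1.0])
err_2bit = np.abs(w - quantize_relaxed(w, 2.0)).sum()
err_8bit = np.abs(w - quantize_relaxed(w, 8.0)).sum()
```

Because `quantize_relaxed` varies continuously with `bits`, a per-layer bitwidth could in principle be tuned by a gradient-based or other continuous optimizer and rounded to an integer afterwards; `compression_rate` then reports the storage saving of the resulting per-layer configuration relative to 32-bit weights.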
Pages: 276-278
Number of pages: 3