Exploration of Automatic Mixed-Precision Search for Deep Neural Networks

Cited by: 0

Authors:

Guo, Xuyang [1]

Huang, Yuanjun [2]

Cheng, Hsin-pai [3]

Li, Bing [3,5]

Wen, Wei [3]

Ma, Siyuan [4]

Li, Hai [3]

Chen, Yiran [3]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[3] Duke Univ, Durham, NC 27706 USA

[4] Xi An Jiao Tong Univ, Xian, Shaanxi, Peoples R China

[5] Army Res Off, Res Triangle Pk, NC 27709 USA

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2019) | 2019

DOI:

10.1109/aicas.2019.8771498

Chinese Library Classification (CLC) number:

TP18 [Theory of artificial intelligence];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Neural networks have shown great performance in cognitive tasks. When network models are deployed on mobile devices with limited computation and storage resources, weight quantization is widely adopted. In practice, 8-bit or 16-bit quantization is most likely to be selected in order to keep accuracy at the same level as models in 32-bit floating-point precision. Binary quantization, by contrast, aims for the highest compression at the cost of a much larger accuracy drop. Applying different precisions to different layers or structures can potentially produce the most efficient model. Searching for the best precision configuration, however, is difficult. In this work, we propose an automatic search algorithm to address this challenge. By relaxing the search space of the quantization bitwidth from a discrete to a continuous domain, our algorithm can generate a mixed-precision quantization scheme that achieves a compression rate close to that of the binary-weighted model while maintaining a testing accuracy similar to that of the original full-precision model.
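
The abstract does not say how the bitwidth relaxation is carried out, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes one common way to make a bitwidth continuous: quantize at the two neighbouring integer bitwidths and interpolate linearly between the results. The function names (`quantize_uniform`, `quantize_relaxed`, `compression_rate`) and the symmetric uniform quantizer are hypothetical choices made for this example.

```python
import math


def quantize_uniform(weights, bits):
    """Quantize a list of floats at an integer bitwidth (assumed scheme).

    bits == 1 uses a binary code: sign(w) * mean(|w|).
    bits >= 2 uses symmetric uniform levels scaled by max(|w|).
    """
    if bits == 1:
        scale = sum(abs(w) for w in weights) / len(weights)
        return [scale if w >= 0 else -scale for w in weights]
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    levels = 2 ** (bits - 1) - 1          # positive levels, e.g. 127 for 8 bits
    step = max_abs / levels
    return [round(w / step) * step for w in weights]


def quantize_relaxed(weights, bits):
    """Continuous bitwidth (assumption): interpolate between floor and ceil."""
    lo, hi = math.floor(bits), math.ceil(bits)
    if lo == hi:
        return quantize_uniform(weights, lo)
    frac = bits - lo
    q_lo = quantize_uniform(weights, lo)
    q_hi = quantize_uniform(weights, hi)
    return [(1.0 - frac) * a + frac * b for a, b in zip(q_lo, q_hi)]


def compression_rate(layer_sizes, layer_bits, full_bits=32):
    """Ratio of full-precision storage to mixed-precision storage.

    Fractional bitwidths are rounded up, since stored weights need whole bits.
    """
    full = sum(layer_sizes) * full_bits
    used = sum(n * math.ceil(b) for n, b in zip(layer_sizes, layer_bits))
    return full / used


if __name__ == "__main__":
    w = [0.5, -0.25, 1.0]
    print(quantize_relaxed(w, 2.5))
    # A two-layer model: 100 weights at 8 bits, 300 weights at 2 bits.
    print(compression_rate([100, 300], [8, 2]))
```

Under this assumption the quantized weights vary continuously with the bitwidth between integer values, which is what would let a search treat each layer's bitwidth as a real-valued parameter. An all-binary configuration gives a compression rate of 32 relative to 32-bit floating point, which is the upper bound the abstract compares against.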

Pages: 276-278

Page count: 3

