Analyzing the Impact of DNN Hardware Accelerators-Oriented Compression Techniques on General-Purpose Low-End Boards

Cited by: 0
Authors
Canzonieri, Giuliano [1 ]
Monteleone, Salvatore [2 ]
Palesi, Maurizio [1 ]
Russo, Enrico [1 ]
Patti, Davide [1 ]
Affiliations
[1] Univ Catania, Dept Elect Elect & Comp Engn DIEEI, Catania, Italy
[2] Niccolo Cusano Univ, Dept Engn, Rome, Italy
Keywords
DNN; Compression techniques; Experimental implementation
DOI
10.1007/978-3-031-14391-5_11
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) have emerged in recent years as the most promising approach to smart data processing. However, their effectiveness remains a challenge when they are deployed on resource-constrained architectures, such as edge devices, which often must support at least the inference phase. This work investigates the impact of two weight compression techniques, originally designed and evaluated for DNN hardware accelerators, in a scenario involving general-purpose low-end hardware. After applying several levels of weight compression to the MobileNet DNN model, we show how accelerator-oriented weight compression techniques can positively impact both memory traffic pressure and inference latency, in some cases yielding a good trade-off against accuracy loss.
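The abstract refers to applying "several levels of weight compression" to a model's weights. As a minimal illustrative sketch only (the paper's actual accelerator-oriented techniques are not specified in this record), magnitude-based weight pruning is one common compression scheme in which a level parameter controls the fraction of smallest-magnitude weights zeroed out, reducing the data that must be stored and transferred at inference time. The function and example values below are hypothetical, not taken from the paper:

```python
# Hedged sketch: magnitude-based weight pruning at configurable
# compression levels. Not the paper's exact method; illustrates the
# general idea of trading accuracy for reduced memory traffic.

def prune_weights(weights, compression_level):
    """Zero out the smallest-magnitude fraction of weights.

    compression_level in [0, 1): fraction of weights to drop.
    """
    ranked = sorted(abs(w) for w in weights)       # magnitudes, ascending
    cut = int(len(ranked) * compression_level)     # how many to drop
    threshold = ranked[cut] if cut < len(ranked) else float("inf")
    # Keep weights at or above the threshold; zero the rest.
    return [w if abs(w) >= threshold else 0.0 for w in weights]

# Sweeping increasing compression levels over one (made-up) weight vector:
layer = [0.8, -0.05, 0.3, -0.7, 0.01, 0.12, -0.4, 0.02]
for level in (0.25, 0.5, 0.75):
    pruned = prune_weights(layer, level)
    zeros = sum(1 for w in pruned if w == 0.0)
    print(f"level={level}: {zeros}/{len(pruned)} weights zeroed")
```

Higher levels zero more weights (here 2, 4, and 6 of 8), which is the kind of sweep the abstract describes when evaluating memory traffic and latency against accuracy loss.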
Pages: 143-155
Page count: 13